A non-indexed document is not a lost document.

It is preserved, physical, real. It occupies space on a shelf or a server. It can be consulted if one knows that it exists and where it is located. But for a system that does not know of it, it does not exist as a mobilizable object. It is present without being accessible. It exists without being findable.

The distinction between physical existence and operative existence is the subject of this dossier.

A large library preserves millions of documents. No researcher can consult them all. What becomes truly accessible is the index. The index decides what is findable. What is findable decides what can be cited, mobilized, compared, refuted. A non-indexed document is preserved but inoperative. It does not enter into chains of proof, bibliographies, arguments. It produces no operative consequences. For research, it does not exist.

This is not an exceptional failure. This is the normal structure of any documentary system.

Medieval library catalogs indexed by title, author, sometimes by subject. But the subjects were the cataloger's categories, not those of the document. A fluid mechanics treatise filed under "natural philosophy" could become unfindable for a reader searching for flow, pressure or fluid resistance. The catalog did not only reflect content. It reflected the categories available at the time of indexation. These categories filtered what could be found, and by whom.
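The filtering effect of the cataloger's categories can be sketched in a few lines. This is only an illustration under stated assumptions: the titles, the headings and the function names (`build_catalog`, `lookup`) are hypothetical and are not drawn from any historical catalog.

```python
from collections import defaultdict

# Hypothetical holdings. "topics" is what each document is actually about;
# "heading" is the single category the cataloger had available.
HOLDINGS = [
    {"title": "On the Motion of Waters",
     "heading": "natural philosophy",
     "topics": {"flow", "pressure", "fluid resistance"}},
    {"title": "Commentary on the Heavens",
     "heading": "natural philosophy",
     "topics": {"planets", "spheres"}},
]

def build_catalog(holdings):
    """Index each title under the cataloger's heading only."""
    catalog = defaultdict(list)
    for item in holdings:
        catalog[item["heading"]].append(item["title"])
    return catalog

def lookup(catalog, term):
    """Return the titles filed under a term, or nothing at all."""
    return catalog.get(term, [])

catalog = build_catalog(HOLDINGS)
by_heading = lookup(catalog, "natural philosophy")
by_topic = lookup(catalog, "pressure")
```

The treatise is on the shelf in both cases. Only the reader who already thinks in the cataloger's vocabulary retrieves it: `by_heading` contains both titles, while `by_topic` is empty.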

The Index librorum prohibitorum, published by the Catholic Church from 1559 to 1966 and administered for much of that period by the Congregation of the Index, operated according to the same logic in reverse: it indexed in order to prohibit, and made visible in order to make inaccessible. A book inscribed in the Index existed as an object of prohibition. It did not exist as a legitimate source. Negative indexation is also an operation of treatability: it converts a physical presence into an operative status, here that of the forbidden.

A search engine does not give access to the web. It gives access to its index of the web.

What it has not explored, what a robots.txt file has excluded, what a dynamic structure has not made crawlable, what a form encloses, what a URL isolates, does not immediately belong to the common space of the findable. This content can be public, legal, hosted, technically present. For the vast majority of users, it does not exist. They cannot find it without knowing precisely where to search. Its physical presence on servers confers no operative existence in the search space.

A search engine's index is not an exhaustive list of what exists. It is the result of a partial exploration process, constrained by finite computational resources, crawl priorities, technical and legal exclusion rules, and relevance criteria that change with algorithms. At each cycle, the system decides what enters the space of the findable and what remains outside.
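The partial exploration described above can be made concrete with a toy crawler. This is a minimal sketch, not a description of any real search engine: the in-memory web, the URLs, the crawler name and the `budget` parameter are all assumptions made for illustration. Only `urllib.robotparser` from the Python standard library is real.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

# Hypothetical toy web: each URL maps to the links found on that page.
TOY_WEB = {
    "https://example.org/": ["https://example.org/a",
                             "https://example.org/private/report"],
    "https://example.org/a": ["https://example.org/b"],
    "https://example.org/b": ["https://example.org/c"],
    "https://example.org/c": [],
    "https://example.org/private/report": [],
    "https://example.org/orphan": [],  # public and hosted, but nothing links to it
}

# Hypothetical exclusion rules published by the site.
ROBOTS_TXT = ["User-agent: *", "Disallow: /private/"]

def crawl(seed, budget):
    """Breadth-first crawl from a seed, bounded by rules and by a page budget."""
    rules = RobotFileParser()
    rules.parse(ROBOTS_TXT)
    index, queue, seen = [], deque([seed]), {seed}
    while queue and len(index) < budget:
        url = queue.popleft()
        if not rules.can_fetch("toy-crawler", url):
            continue  # excluded by robots.txt: never enters the index
        index.append(url)
        for link in TOY_WEB.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl("https://example.org/", budget=3)
off_field = sorted(set(TOY_WEB) - set(index))
```

With a budget of three pages, the index holds the seed and the first two pages reached from it. The off-field holds three pages for three different reasons: one was excluded by rule, one was never linked, one was discovered but left unexplored when the budget ran out. All three remain on the server.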

PageRank, Google's original ranking algorithm, evaluated a page's importance by the number and quality of its incoming links. A page without incoming links cannot be discovered by a crawler that follows links. It can still be indexed if its URL is known or directly submitted, but it does not benefit from the same discovery regime. The indexation criterion and the relevance criterion reinforce each other: what is already known is more easily found; what is more easily found is more often cited; what is more often cited becomes more easily findable.

The index amplifies what is already in the index.

This circularity is not arbitrary. It comes from an operative decision about what the importance of a document means: importance is derived from links, links are human citation decisions, and citation decisions depend on what was findable previously. The criterion becomes endogenous to the system it measures.
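The derivation of importance from links can be sketched with a simplified power iteration. This is a textbook-style approximation, not Google's production system: the four-page graph, the page names, the damping factor of 0.85 (a conventional choice) and the fixed iteration count are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration over a dict {page: [outgoing links]}."""
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share, whatever its links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outgoing or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: "orphan" cites the hub, but nothing cites "orphan".
LINKS = {
    "hub": ["a", "b"],
    "a": ["hub"],
    "b": ["hub"],
    "orphan": ["hub"],
}

scores = pagerank(LINKS)
lowest = min(scores, key=scores.get)
highest = max(scores, key=scores.get)
```

The page that nobody cites keeps only the baseline share, however many pages it cites itself. The page that everyone cites accumulates the rest. In this sketch the score says nothing about what the pages contain; it records only who was already visible enough to be linked.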

PubMed indexes tens of millions of bibliographic references in the biomedical sciences. A clinical study published in a non-indexed journal can be rigorous, relevant, replicable. It exists in the literature in the material sense. But it does not enter into the same chains of systematic review, automated meta-analysis, clinical recommendation or evidence-based medicine. It has not disappeared. It is simply not operative in the same places.

Indexation policy is therefore an epistemic policy. It decides what can count as proof in the chain that goes from clinical study to therapeutic recommendation. A non-indexed journal produces studies that exist, but that enter with difficulty into the devices where practices are aggregated, compared and decided.

The index is not a list of what exists.

It is the condition of what can count.

Doctrine

The index is a threshold operation applied to a collection.

It converts a physical presence into operative accessibility. What passes the threshold of indexation becomes a mobilizable object in the chains of proofs, citations, arguments, decisions. What does not pass remains in the off-field of documentary treatability.

Indexation policy is an epistemic policy. It decides what can count as source, proof, reference, authority. It is not neutral, because it is not exhaustive. All indexation operates by selection, and all selection produces an off-field.

This off-field is not empty. It is populated with real documents, rigorous studies, existing texts. It is simply inaccessible to systems that operate from the index. For these systems, it does not exist.

Open vector

Recommendation algorithms are dynamic and personalized indexes. They do not index what exists in a common space. They index what corresponds to an inferred profile, a history, clicks, a location, an access time, an engagement probability. Two people searching for the same term can receive two different findability spaces. The index becomes personal.

What each person can find depends on what the system has decided they are looking for. The space of the findable is no longer common. It is individualized, opaque to the user, revisable at each query. Accessible reality becomes an unstable operative object, different according to who searches and when.

The off-field of this personalized index is not a shared off-field. It is specific to each user. What one cannot find, the other can find. What does not exist for one exists for the other. The index fragments the space of the findable into as many spaces as there are profiles.
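The fragmentation of the findable can be illustrated with a deliberately naive personalized search. Everything here is hypothetical: the three documents, the two profiles, the scoring by topic affinity and the `top_k` cutoff are assumptions made for the sketch, and real recommendation systems are far more complex.

```python
# Hypothetical collection: every document matches the query term "index".
DOCUMENTS = {
    "doc1": {"terms": {"index", "library"}, "topic": "history"},
    "doc2": {"terms": {"index", "crawler"}, "topic": "web"},
    "doc3": {"terms": {"index", "pubmed"}, "topic": "medicine"},
}

def personalized_search(query, profile, top_k):
    """Rank matching documents by the profile's affinity for their topic."""
    matches = [doc for doc, meta in DOCUMENTS.items() if query in meta["terms"]]
    ranked = sorted(
        matches,
        key=lambda doc: (-profile.get(DOCUMENTS[doc]["topic"], 0.0), doc),
    )
    return ranked[:top_k]

# Hypothetical inferred profiles: topic -> estimated engagement probability.
alice = {"history": 0.9, "web": 0.1}
bob = {"medicine": 0.8, "web": 0.3}

found_alice = personalized_search("index", alice, top_k=2)
found_bob = personalized_search("index", bob, top_k=2)

# Each user's off-field: what matched the query but was not shown to them.
off_field_alice = set(DOCUMENTS) - set(found_alice)
off_field_bob = set(DOCUMENTS) - set(found_bob)
shared = set(found_alice) & set(found_bob)
```

The same query over the same collection yields two different findable spaces. Each user's off-field contains exactly the document the other user sees first, and only one document out of three remains common to both.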

When two people can no longer find the same things, can they still share the same proofs?

References

A. Lynge Internal Archives