We do not store original documents. All data we store is first anonymized and atomized. This can be thought of as the digital equivalent of blanking out sensitive information with correction fluid, shredding the document, and then shuffling the pieces. An original document therefore cannot be reconstructed from the stored chunks.
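As a rough illustration of the idea, and not a description of the production system, the three steps (blank out, shred, shuffle) could be sketched in Python as follows. The function names, the fixed list of sensitive terms, and the chunk size of three words are all assumptions made for this example; a real system would need a far more robust way of detecting sensitive information.

```python
import random
import re

# Hypothetical list of sensitive terms, used only for this example.
# A fixed list is a stand-in for proper detection of names and entities.
SENSITIVE_TERMS = {"Acme Corp", "Jane Doe"}


def anonymize(text, sensitive_terms):
    """Replace every occurrence of a sensitive term with a placeholder."""
    for term in sensitive_terms:
        text = re.sub(re.escape(term), "[REDACTED]", text)
    return text


def atomize(text, chunk_size=3):
    """Split text into short chunks of at most `chunk_size` words."""
    words = text.split()
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]


def shred_and_shuffle(text, sensitive_terms, chunk_size=3, seed=None):
    """Anonymize the text, cut it into chunks, and shuffle their order."""
    chunks = atomize(anonymize(text, sensitive_terms), chunk_size)
    random.Random(seed).shuffle(chunks)  # the original chunk order is discarded
    return chunks


chunks = shred_and_shuffle(
    "Jane Doe shall provide notice to Acme Corp within ten days",
    SENSITIVE_TERMS,
)
```

In this sketch only the shuffled chunks are kept. The input string and the original order of the chunks are not stored anywhere.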
The shredded pieces of data are used to compile statistics on which words appear next to each other, in order to learn the general structure of legal language and sentences. This general language model serves as the baseline for our AI before we tailor it to the specific tasks and needs of your firm. The general language model is shared across clients because training it requires more text than any single firm can produce. We do not consider this data sensitive, since it only describes the broad properties of legal language. For instance, Donna knows that the words “shall” and “provide” often appear together, but not how they relate to any particular person or company.
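To make "statistics on which words appear next to each other" concrete, here is a minimal sketch of counting adjacent word pairs across anonymized chunks. It is an illustration under simple assumptions (chunks are lists of words, pairs are counted case-insensitively), not the actual model behind Donna, and the function name is hypothetical.

```python
from collections import Counter


def count_adjacent_pairs(chunks):
    """Count how often each pair of neighbouring words occurs within chunks."""
    counts = Counter()
    for chunk in chunks:
        # Pair each word with the word that follows it in the same chunk.
        for left, right in zip(chunk, chunk[1:]):
            counts[(left.lower(), right.lower())] += 1
    return counts


chunks = [
    ["shall", "provide", "notice"],
    ["Shall", "provide"],
    ["to", "[REDACTED]"],
]
pair_counts = count_adjacent_pairs(chunks)
print(pair_counts[("shall", "provide")])  # 2
```

The resulting counts record that "shall" is often followed by "provide", but they hold no link back to the document, person, or company the words came from.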