Working with Stores

Modified on Tue, 06 Sep 2022 at 02:42 AM

A store is how the Kodexa platform holds information.

There are three types of store that you can create:

  • Model stores hold the implementation, training and metadata for a model that can be executed on a Model Runtime.
  • Document stores hold files (and their associated document representations).
  • Data stores hold the extracted data objects and attributes that have been identified in documents held in a document store.

At a high level, stores are designed to hold native files together with their associated “Document” representations of the unstructured data. By defining a Data Structure, we add labels to these documents, and the platform can then extract the labeled data into a structured form.

Access a Store

Like all components within Kodexa, a store has a reference, and you can use that reference with a Kodexa client to access the store.

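from kodexa import *

# Create a client for the Kodexa platform
client = KodexaClient()

# Look up the store by component type ('store') and its reference
store_endpoint = client.get_object_by_ref('store', 'example-org/1231454123-processing')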

This method returns an instance of a store endpoint. The kind of endpoint you get back depends on the type of store that you referenced. In the following sections we will explore a little about the three types of store.
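The idea that the returned endpoint depends on the store type can be pictured with a small, self-contained sketch. It does not use the Kodexa SDK: the class names, the `endpoint_for` helper and the type labels ("document", "data", "model") are hypothetical stand-ins chosen only to illustrate the dispatch.

```python
from dataclasses import dataclass


@dataclass
class StoreSketch:
    """Hypothetical stand-in for a store endpoint (not a Kodexa SDK class)."""
    ref: str


class DocumentStoreSketch(StoreSketch):
    """Stand-in for an endpoint onto a document store."""


class DataStoreSketch(StoreSketch):
    """Stand-in for an endpoint onto a data store."""


class ModelStoreSketch(StoreSketch):
    """Stand-in for an endpoint onto a model store."""


# Assumed type labels; the real platform may name these differently.
_ENDPOINT_BY_TYPE = {
    "document": DocumentStoreSketch,
    "data": DataStoreSketch,
    "model": ModelStoreSketch,
}


def endpoint_for(store_type: str, ref: str) -> StoreSketch:
    """Return the endpoint stand-in that matches the given store type."""
    try:
        endpoint_cls = _ENDPOINT_BY_TYPE[store_type]
    except KeyError:
        raise ValueError(f"unknown store type: {store_type!r}") from None
    return endpoint_cls(ref)


endpoint = endpoint_for("document", "example-org/1231454123-processing")
print(type(endpoint).__name__)
```

In practice the client performs this resolution for you, so calling code only needs to know which kind of store a reference points to.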

Document Stores 

A document store is one that is responsible for holding files that will be parsed, labeled and used as a source for structured data.

The term document refers to the fact that when you upload a file (like a PDF), we create a container that holds both the original file (we call it the native) and one or more Kodexa Documents that hold the semi-structured representation of the native file.

These containers of native files and documents are called Document Families. They are brought together because all the documents are representations of the original file, and we support holding multiple documents because models (or humans) can label documents independently.
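As a rough mental model, a Document Family can be sketched as one native file plus a list of document representations, each labeled independently. The classes and field names below (`DocumentFamilySketch`, `DocumentSketch`, `labeled_by`, `labels`) are illustrative assumptions, not the Kodexa SDK's own types.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class DocumentSketch:
    """One semi-structured representation of the native file (hypothetical)."""
    labeled_by: str                      # e.g. a model name or a human reviewer
    labels: List[str] = field(default_factory=list)


@dataclass
class DocumentFamilySketch:
    """Container pairing one native file with its document representations."""
    native_path: str
    documents: List[DocumentSketch] = field(default_factory=list)

    def add_document(self, labeled_by: str, labels: Iterable[str] = ()) -> DocumentSketch:
        """Attach another representation of the same native file."""
        document = DocumentSketch(labeled_by, list(labels))
        self.documents.append(document)
        return document


# One uploaded file, three independent representations of it.
family = DocumentFamilySketch("invoice.pdf")
family.add_document("parser")
family.add_document("model-a", ["invoice_number"])
family.add_document("reviewer", ["invoice_number", "total"])

for document in family.documents:
    print(document.labeled_by, document.labels)
```

Keeping every representation under the same family means the labels produced by a model and those produced by a reviewer can coexist without overwriting each other.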

Data Stores

A data store is designed to hold structured data that has been extracted from a set of labeled documents that are held in a document store.

A data store is linked to a Data Structure (internally called a Taxonomy). The Data Structure formalizes the structure of the data into groups and individual data attributes. The actual data points and their related groups are then created in the data store, with lineage back to the document store holding the document representation.
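The relationship between a Data Structure, the data objects created from it, and their lineage can be sketched as follows. This is a simplified model under stated assumptions: `GroupSketch`, `DataObjectSketch`, `StructuredStoreSketch` and the `source_document` field are hypothetical names, and the validation rule (reject attributes the structure does not define) is an illustrative choice rather than documented platform behavior.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GroupSketch:
    """A group in the Data Structure and the attributes it allows (hypothetical)."""
    name: str
    attributes: List[str]


@dataclass
class DataObjectSketch:
    """An extracted data object, with lineage back to its source document."""
    group: str
    values: Dict[str, str]
    source_document: str


@dataclass
class StructuredStoreSketch:
    """A data store stand-in tied to one Data Structure."""
    structure: Dict[str, GroupSketch]
    objects: List[DataObjectSketch] = field(default_factory=list)

    def add(self, group: str, values: Dict[str, str], source_document: str) -> DataObjectSketch:
        """Store a data object after checking it against the Data Structure."""
        if group not in self.structure:
            raise ValueError(f"unknown group: {group!r}")
        unknown = set(values) - set(self.structure[group].attributes)
        if unknown:
            raise ValueError(f"attributes not in structure: {sorted(unknown)}")
        data_object = DataObjectSketch(group, dict(values), source_document)
        self.objects.append(data_object)
        return data_object


structure = {"invoice": GroupSketch("invoice", ["invoice_number", "total"])}
store = StructuredStoreSketch(structure)
record = store.add("invoice", {"invoice_number": "INV-1"}, source_document="invoice.pdf")
print(record.group, record.values, record.source_document)
```

Recording the source on every data object is what lets a structured value be traced back to the document it was extracted from.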

Model Stores

A model store is a very different type of store. While it does hold native files (like a document store), it is designed to hold the implementation and trained representations of a machine-learning model.
