java.lang.Object | |
↳ | com.pspdfkit.document.library.PdfLibrary |
PdfLibrary implements a SQLite-based full-text search engine. You can register documents to be indexed in the background and then search for keywords within that collection. Multiple libraries can exist, although one is usually enough for common use cases.
Nested Classes | |
---|---|
@interface | PdfLibrary.Tokenizer |
Constants | | |
---|---|---|
String | PORTER_TOKENIZER | The name of PSPDFKit's custom porter tokenizer that allows better CJK indexing. |
String | UNICODE_TOKENIZER | The name of PSPDFKit's custom Unicode tokenizer. |
Public Methods | | |
---|---|---|
void | addLibraryIndexingListener(LibraryIndexingListener listener) | Adds a LibraryIndexingListener to monitor document indexing status. |
void | clearIndex() | Completely clears the index for this library. |
void | enqueueDocumentSources(List<DocumentSource> documentSources) | Queues a list of documents for indexing. |
void | enqueueDocumentSourcesWithMetadata(List<Pair<DocumentSource, byte[]>> documentSources) | Queues a list of documents for indexing together with the passed free-form metadata. |
void | enqueueDocuments(List<PdfDocument> documents, IndexingOptions indexingOptions) | Queues a list of documents for indexing. |
void | enqueueDocuments(List<PdfDocument> documents) | Queues a list of documents for indexing. |
void | enqueueDocumentsWithMetadata(List<Pair<PdfDocument, byte[]>> documents) | Queues a list of documents for indexing together with the passed free-form metadata. |
void | enqueueDocumentsWithMetadata(List<Pair<PdfDocument, byte[]>> documents, IndexingOptions indexingOptions) | Queues a list of documents for indexing together with the passed free-form metadata. |
static PdfLibrary | get(String path, String tokenizer) | Returns a library for a given path. |
static PdfLibrary | get(String path) | Returns a library for a given path. |
LibraryIndexStatus | getIndexStatusForUID(String uid) | Returns the indexing status for the document with the passed UID. |
List<String> | getIndexedUIDs() | Returns the list of UIDs of documents currently indexed. |
byte[] | getMetadataForUID(String uid) | Returns the metadata attached to a document by an enqueueDocumentsWithMetadata(List) call. |
List<String> | getQueuedUIDs() | Returns the list of UIDs of documents queued for indexing. |
boolean | getSaveReverseText() | Indicates whether saving the reverse text is enabled. |
boolean | isIndexing() | Indicates whether indexing is in progress. |
void | removeDocuments(List<String> documentUIDs) | Invalidates the index for the given documents. |
void | removeLibraryIndexingListener(LibraryIndexingListener listener) | Removes a registered LibraryIndexingListener added with addLibraryIndexingListener(LibraryIndexingListener). |
void | search(String searchString, QueryOptions options, QueryResultListener resultListener) | Queries the database for matches of searchString. |
void | setSaveReverseText(boolean saveReverseText) | Saves a reversed copy of the original page text. |
int | size() | Returns the number of indexed documents in this library. |
void | stopSearch() | Stops the search and all in-progress preview text generator tasks. |
String PORTER_TOKENIZER
The name of PSPDFKit's custom porter tokenizer that allows better CJK indexing. This tokenizer has drawbacks, notably much laxer word matching: searching for "Dependency" also returns "Dependencies". If this is a problem, use UNICODE_TOKENIZER instead.
This is the default tokenizer used when no other one is specified.
String UNICODE_TOKENIZER
The name of PSPDFKit's custom Unicode tokenizer. This tokenizer wraps SQLite's unicode61 tokenizer to add full case folding to the indexed text.
Warning: This tokenizer is only available when the library supports FTS5. Otherwise, specifying it as the value of the tokenizer parameter results in an error when trying to create the library.
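To illustrate why full case folding matters for matching, the following self-contained sketch compares plain lowercasing with an approximation of full case folding. It does not call PSPDFKit or SQLite. The class `CaseFoldDemo` and its methods are hypothetical, and the uppercase-then-lowercase step only approximates Unicode full case folding; the tokenizer's actual behavior may differ.

```java
import java.util.Locale;

class CaseFoldDemo {
    // Simple folding: lowercase only, so the sharp s (U+00DF) is left unchanged.
    static String simpleFold(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // Approximation of full case folding: uppercasing expands U+00DF to "SS",
    // and lowercasing afterwards yields "ss".
    static String approxFullFold(String s) {
        return s.toUpperCase(Locale.ROOT).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String indexed = "Stra\u00DFe"; // the German word for "street", spelled with a sharp s
        String query = "STRASSE";
        System.out.println(simpleFold(indexed).equals(simpleFold(query)));         // false
        System.out.println(approxFullFold(indexed).equals(approxFullFold(query))); // true
    }
}
```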
void addLibraryIndexingListener(LibraryIndexingListener listener)
Adds a LibraryIndexingListener to monitor document indexing status. If the listener has already been added, this method is a no-op. Adding null is not allowed and results in an exception.
Parameters | |
---|---|
listener | LibraryIndexingListener that should be notified. Must be non-null. |
void clearIndex()
Completely clears the index for this library.
void enqueueDocumentSources(List<DocumentSource> documentSources)
Queues a list of documents for indexing. Any documents already queued or fully indexed are ignored. This call avoids opening documents until they are indexed, so it is significantly more memory friendly than enqueueDocuments(List).
Parameters | |
---|---|
documentSources | List of document sources to index. |
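The rule that already queued or fully indexed documents are ignored can be modeled with a small in-memory sketch. This is not PSPDFKit code: `IndexQueueModel` is a hypothetical stand-in that tracks UIDs only, and unlike the real methods (which return void) its `enqueue` returns the accepted UIDs so the effect is visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class IndexQueueModel {
    private final Set<String> queued = new LinkedHashSet<>();
    private final Set<String> indexed = new LinkedHashSet<>();

    // Queues UIDs, skipping any that are already queued or already indexed.
    // Returns the UIDs that were actually added to the queue.
    List<String> enqueue(List<String> uids) {
        List<String> accepted = new ArrayList<>();
        for (String uid : uids) {
            if (!indexed.contains(uid) && queued.add(uid)) {
                accepted.add(uid);
            }
        }
        return accepted;
    }

    // Simulates the background indexer finishing one queued document.
    void markIndexed(String uid) {
        if (queued.remove(uid)) {
            indexed.add(uid);
        }
    }

    List<String> getQueuedUIDs() {
        return new ArrayList<>(queued);
    }

    List<String> getIndexedUIDs() {
        return new ArrayList<>(indexed);
    }

    public static void main(String[] args) {
        IndexQueueModel model = new IndexQueueModel();
        System.out.println(model.enqueue(Arrays.asList("a", "b", "a"))); // [a, b]
        model.markIndexed("a");
        System.out.println(model.enqueue(Arrays.asList("a", "b", "c"))); // [c]
        System.out.println(model.getQueuedUIDs());                       // [b, c]
        System.out.println(model.getIndexedUIDs());                      // [a]
    }
}
```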
void enqueueDocumentSourcesWithMetadata(List<Pair<DocumentSource, byte[]>> documentSources)
Queues a list of documents for indexing together with the passed free-form metadata. This call avoids opening documents until they are indexed, so it is significantly more memory friendly than enqueueDocumentsWithMetadata(List).
Metadata can be retrieved after indexing by calling getMetadataForUID(String).
Any documents already queued or fully indexed are ignored.
Parameters | |
---|---|
documentSources | List of document sources to index, each paired with the metadata to be stored. |
void enqueueDocuments(List<PdfDocument> documents, IndexingOptions indexingOptions)
Queues a list of documents for indexing. Any documents already queued or fully indexed are ignored.
Parameters | |
---|---|
documents | List of documents to index. |
indexingOptions | Options for indexing the given documents. |
void enqueueDocuments(List<PdfDocument> documents)
Queues a list of documents for indexing. Any documents already queued or fully indexed are ignored.
NOTE: This call requires all documents to be opened when indexing and will most likely lead to out-of-memory conditions if many documents are passed. Prefer enqueueDocumentSources(List) if possible.
Parameters | |
---|---|
documents | List of documents to index. |
void enqueueDocumentsWithMetadata(List<Pair<PdfDocument, byte[]>> documents)
Queues a list of documents for indexing together with the passed free-form metadata. Metadata can be retrieved after indexing by calling getMetadataForUID(String).
NOTE: This call requires all documents to be opened when indexing and will most likely lead to out-of-memory conditions if many documents are passed. Prefer enqueueDocumentSourcesWithMetadata(List) if possible.
Any documents already queued or fully indexed are ignored.
Parameters | |
---|---|
documents | List of documents to index with metadata to be stored. |
void enqueueDocumentsWithMetadata(List<Pair<PdfDocument, byte[]>> documents, IndexingOptions indexingOptions)
Queues a list of documents for indexing together with the passed free-form metadata. Metadata can be retrieved after indexing by calling getMetadataForUID(String).
Any documents already queued or fully indexed are ignored.
Parameters | |
---|---|
documents | List of documents to index with metadata to be stored. |
indexingOptions | Options for indexing the given documents. |
static PdfLibrary get(String path, String tokenizer)
Returns a library for a given path. If no library exists for this path yet, this method creates and returns one.
Parameters | |
---|---|
path | Writable path to the library database file. |
tokenizer | The tokenizer to use, one of PORTER_TOKENIZER or UNICODE_TOKENIZER. This controls how the PdfLibrary matches queries to the content in the index. |
Throws | |
---|---|
IOException | If the file could not be written. |
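Because UNICODE_TOKENIZER is only available with FTS5, a caller may want to fall back to another tokenizer when creation fails. The sketch below shows one possible pattern under stated assumptions: `TokenizerFallback` and `Opener` are hypothetical stand-ins for a call such as get(path, tokenizer), the strings "unicode" and "porter" are placeholders rather than the real constant values, and the code assumes the failure surfaces as an exception. Whether retrying with another tokenizer is safe for an existing database file is not covered here.

```java
class TokenizerFallback {
    // Hypothetical stand-in for a factory call such as PdfLibrary.get(path, tokenizer).
    interface Opener<T> {
        T open(String path, String tokenizer) throws Exception;
    }

    // Tries the preferred tokenizer first and retries once with the fallback.
    static <T> T openWithFallback(Opener<T> opener, String path,
                                  String preferred, String fallback) throws Exception {
        try {
            return opener.open(path, preferred);
        } catch (Exception e) {
            // Assumption: an unavailable tokenizer is reported as an exception.
            return opener.open(path, fallback);
        }
    }

    public static void main(String[] args) throws Exception {
        // Fake opener that rejects the "unicode" placeholder, as if FTS5 were missing.
        Opener<String> fake = (path, tokenizer) -> {
            if (tokenizer.equals("unicode")) {
                throw new IllegalStateException("FTS5 not available");
            }
            return path + " opened with " + tokenizer;
        };
        System.out.println(openWithFallback(fake, "library.db", "unicode", "porter"));
        // prints "library.db opened with porter"
    }
}
```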
static PdfLibrary get(String path)
Returns a library for a given path. If no library exists for this path yet, this method creates and returns one.
Parameters | |
---|---|
path | Writable path to the library database file. |
Throws | |
---|---|
IOException | If the file could not be written. |
LibraryIndexStatus getIndexStatusForUID(String uid)
Returns the indexing status for the document with the passed UID.
Parameters | |
---|---|
uid | UID of the document. |
List<String> getIndexedUIDs()
Returns the list of UIDs of documents currently indexed.
byte[] getMetadataForUID(String uid)
Returns the metadata attached to a document by an enqueueDocumentsWithMetadata(List) call.
Parameters | |
---|---|
uid | UID of the document. |
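Since metadata is a free-form byte array, the caller decides how to encode it. A minimal sketch, assuming the metadata is text stored as UTF-8, is shown below. `MetadataCodec` is a hypothetical helper and not part of PSPDFKit; any other serialization would work equally well.

```java
import java.nio.charset.StandardCharsets;

class MetadataCodec {
    // Encodes text into bytes suitable for passing as free-form metadata.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes bytes previously produced by encode, for example after retrieving them by UID.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] metadata = encode("{\"title\":\"Annual report\"}");
        System.out.println(decode(metadata)); // {"title":"Annual report"}
    }
}
```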
List<String> getQueuedUIDs()
Returns the list of UIDs of documents queued for indexing.
boolean getSaveReverseText()
Indicates whether saving the reverse text is enabled.
Returns true if saving reverse text is enabled, false otherwise.
boolean isIndexing()
Indicates whether indexing is in progress.
Returns true if indexing is in progress, false otherwise.
void removeDocuments(List<String> documentUIDs)
Invalidates the index for the given documents.
Parameters | |
---|---|
documentUIDs | List of document UIDs to be invalidated. |
void removeLibraryIndexingListener(LibraryIndexingListener listener)
Removes a registered LibraryIndexingListener added with addLibraryIndexingListener(LibraryIndexingListener). After this call, the listener is no longer notified of any changes. If the listener has not been added, this method is a no-op. Passing null is not allowed and results in an exception.
Parameters | |
---|---|
listener | LibraryIndexingListener that should be removed. Must be non-null. |
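The documented listener semantics (repeated adds and removes are no-ops, null is rejected) can be modeled with a set-backed registry. This is an illustrative sketch only: `ListenerRegistry` is hypothetical, and the choice of NullPointerException is an assumption, since the exception type is not stated above.

```java
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

class ListenerRegistry<L> {
    private final Set<L> listeners = new LinkedHashSet<>();

    // Adding a listener that is already registered leaves the set unchanged.
    void add(L listener) {
        // Assumption: null is rejected with a NullPointerException.
        Objects.requireNonNull(listener, "listener must not be null");
        listeners.add(listener);
    }

    // Removing a listener that was never registered leaves the set unchanged.
    void remove(L listener) {
        Objects.requireNonNull(listener, "listener must not be null");
        listeners.remove(listener);
    }

    int size() {
        return listeners.size();
    }

    public static void main(String[] args) {
        ListenerRegistry<Runnable> registry = new ListenerRegistry<>();
        Runnable listener = () -> { };
        registry.add(listener);
        registry.add(listener);              // no-op, already registered
        System.out.println(registry.size()); // 1
        registry.remove(listener);
        registry.remove(listener);           // no-op, already removed
        System.out.println(registry.size()); // 0
    }
}
```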
void search(String searchString, QueryOptions options, QueryResultListener resultListener)
Queries the database for matches of searchString. Only direct, begins-with, and ends-with matches are supported. Results are delivered to resultListener as a map of document UIDs to the set of matching pages in each document.
Parameters | |
---|---|
searchString | String to search for. |
options | Options object determining search behavior. May be null for default behavior. |
resultListener | Callback listener that is called with the search results. Note that the listener's methods are called on a background thread. |
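Because results arrive on a background thread, callers often need to hand them over to the thread that started the search. The sketch below shows one way to do that by bridging a callback into a CompletableFuture. `SearchBridge`, `ResultCallback`, and `AsyncSearch` are hypothetical stand-ins; the real QueryResultListener methods are not shown in this reference, so the callback shape here is an assumption.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SearchBridge {
    // Hypothetical callback: receives a map of document UID to matching page numbers.
    interface ResultCallback {
        void onResults(Map<String, Set<Integer>> pagesByUid);
    }

    // Hypothetical stand-in for a callback-based search that reports on a background thread.
    interface AsyncSearch {
        void search(String searchString, ResultCallback callback);
    }

    // Wraps the callback API so the caller can wait for, or chain on, the result.
    static CompletableFuture<Map<String, Set<Integer>>> searchAsFuture(
            AsyncSearch engine, String searchString) {
        CompletableFuture<Map<String, Set<Integer>>> future = new CompletableFuture<>();
        engine.search(searchString, future::complete);
        return future;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService background = Executors.newSingleThreadExecutor();
        try {
            // Fake engine that answers from a background thread with one fixed hit.
            AsyncSearch fake = (query, callback) -> background.execute(() ->
                    callback.onResults(
                            Collections.singletonMap("doc-1", Collections.singleton(3))));
            Map<String, Set<Integer>> results =
                    searchAsFuture(fake, "invoice").get(5, TimeUnit.SECONDS);
            System.out.println(results); // {doc-1=[3]}
        } finally {
            background.shutdown();
        }
    }
}
```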
void setSaveReverseText(boolean saveReverseText)
Saves a reversed copy of the original page text. If enabled, the index database is about twice as large, but ends-with matches become available.
Parameters | |
---|---|
saveReverseText | true to save reversed text to the index. Defaults to true. |
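One way to see why a reversed copy enables ends-with matches: a suffix of a word is a prefix of the reversed word, so an index that supports begins-with lookups can answer ends-with queries over reversed text. The sketch below demonstrates the idea with a sorted set. It is a conceptual model only; `ReverseTextDemo` is hypothetical and does not describe how the library stores or queries its index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class ReverseTextDemo {
    // Sorted set of reversed words, standing in for a prefix-searchable index.
    private final TreeSet<String> reversedWords = new TreeSet<>();

    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    void index(String word) {
        reversedWords.add(reverse(word));
    }

    // An ends-with match on the original words is a begins-with match on the reversed ones.
    List<String> endsWith(String suffix) {
        String prefix = reverse(suffix);
        List<String> matches = new ArrayList<>();
        // Entries sharing the prefix are contiguous, starting at the first entry >= prefix.
        for (String candidate : reversedWords.tailSet(prefix, true)) {
            if (!candidate.startsWith(prefix)) {
                break;
            }
            matches.add(reverse(candidate));
        }
        return matches;
    }

    public static void main(String[] args) {
        ReverseTextDemo demo = new ReverseTextDemo();
        demo.index("indexing");
        demo.index("searching");
        demo.index("library");
        System.out.println(demo.endsWith("ing")); // [searching, indexing]
    }
}
```

The doubled storage in this model (every word kept a second time in reversed form) mirrors the roughly 2x database size mentioned above.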
int size()
Returns the number of indexed documents in this library.
void stopSearch()
Stops the search and all in-progress preview text generator tasks.