Websubmit_file_metadata APIs and Plugin Development
The websubmit_file_metadata library enables extraction and update of file metadata.
It can be called from Python sources or run from the command line.
The library can be extended to support additional formats through plugins, which must be placed in the /opt/invenio/lib/python/invenio/websubmit_file_metadata_plugins/ directory.
Two main functions can be imported from websubmit_file_metadata:
def read_metadata(inputfile, force=None, remote=False, loginpw=None, verbose=0):
    Returns metadata extracted from given file as dictionary.
    Availability depends on input file format and installed plugins
    (raises TypeError if the file format is unsupported).

    Parameters:
      * inputfile (string) - path to a file
      * force (string) - name of plugin to use, to skip plugin auto-discovery
      * remote (boolean) - if the file is accessed remotely or not
      * loginpw (string) - credentials to access secure servers (username:password)
      * verbose (int) - verbosity
    Returns: dict - dictionary of metadata tags as keys, and (interpreted) value as value
    Raises:
      * TypeError - if file format is not supported.
      * RuntimeError - if required library to process file is missing.
      * InvenioWebSubmitFileMetadataRuntimeError - when metadata cannot be read.

def write_metadata(inputfile, outputfile, metadata_dictionary, force=None, verbose=0):
    Writes metadata to given file.
    Availability depends on input file format and installed plugins
    (raises TypeError if the file format is unsupported).

    Parameters:
      * inputfile (string) - path to a file
      * outputfile (string) - path to the resulting file.
      * metadata_dictionary (dict) - keys and values of metadata to update.
      * force (string) - name of plugin to use, to skip plugin auto-discovery
      * verbose (int) - verbosity
    Returns: string - output of the plugin
    Raises:
      * TypeError - if file format is not supported.
      * RuntimeError - if required library to process file is missing.
      * InvenioWebSubmitFileMetadataRuntimeError - when metadata cannot be updated.
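As a hedged illustration of the calling convention above, the following sketch wraps a read_metadata-style function and turns the two generic documented errors (TypeError for an unsupported format, RuntimeError for a missing library) into a None result. The helper name read_metadata_or_none and the two stand-in readers are hypothetical. The reader is passed in as an argument so that the sketch does not assume a particular import path; in an Invenio installation you would presumably pass the read_metadata function imported from websubmit_file_metadata.

```python
def read_metadata_or_none(reader, inputfile, **options):
    """Call reader(inputfile, **options) and return its dictionary.

    Hypothetical helper: returns None when the reader raises TypeError
    (unsupported file format) or RuntimeError (required library missing),
    the two generic exceptions documented for read_metadata.
    """
    try:
        return reader(inputfile, **options)
    except (TypeError, RuntimeError):
        return None


# Stand-in readers used only to exercise the helper; they are not part of
# the library and merely mimic the documented signature.
def fake_reader_ok(inputfile, force=None, remote=False, loginpw=None, verbose=0):
    return {"Title": "Example", "Source": inputfile}


def fake_reader_unsupported(inputfile, force=None, remote=False, loginpw=None, verbose=0):
    raise TypeError("unsupported file format")


result_ok = read_metadata_or_none(fake_reader_ok, "/tmp/example.pdf", verbose=1)
result_unsupported = read_metadata_or_none(fake_reader_unsupported, "/tmp/example.xyz")
```

InvenioWebSubmitFileMetadataRuntimeError is deliberately left uncaught in this sketch, so that a genuine failure to read a supported file still reaches the caller.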
You can develop new plugins to extend the compatibility of the library with additional file formats.
Your plugin file name must start with "wsm_" and end with ".py", for example wsm_myplugin.py.
Once ready, it must be placed in the /opt/invenio/lib/python/invenio/websubmit_file_metadata_plugins/ directory.
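The naming rule can be checked before you install a plugin. The sketch below is only an illustration of the rule as stated here; the helper name is_plugin_filename is hypothetical and is not part of the library.

```python
import fnmatch
import os


def is_plugin_filename(path):
    """Return True if the file name starts with 'wsm_' and ends with '.py'."""
    # Only the last path component matters; the match is case-sensitive.
    return fnmatch.fnmatchcase(os.path.basename(path), "wsm_*.py")


plugin_dir = "/opt/invenio/lib/python/invenio/websubmit_file_metadata_plugins/"
valid = is_plugin_filename(plugin_dir + "wsm_myplugin.py")
missing_prefix = is_plugin_filename(plugin_dir + "myplugin.py")
wrong_suffix = is_plugin_filename(plugin_dir + "wsm_myplugin.txt")
```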
Your plugin can define the following interface:
The functions can_read_local(..), can_read_remote(..) and can_write_local(..) are called at runtime by the library on all installed plugins to check which ones can process the given file for the given action. If one of these functions returns True, your plugin will be selected to process the file. You can omit one or several of these functions (e.g. if you don't support reading from a remote server, simply omit can_read_remote(..)).
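The selection step can be pictured with the following sketch. It is not the library's actual implementation: the function plugins_for_action, the plugin names and the stand-in plugin objects are all hypothetical, and the only behaviour taken from the text is that a plugin is selected when its check function returns True and is skipped when it does not define that function.

```python
from types import SimpleNamespace


def plugins_for_action(plugins, check_name, inputfile):
    """Return the names of plugins whose check function accepts inputfile.

    plugins maps a plugin name to a module-like object. A plugin that does
    not define the check function (e.g. can_read_remote) is skipped.
    """
    selected = []
    for name, plugin in sorted(plugins.items()):
        check = getattr(plugin, check_name, None)
        if check is not None and check(inputfile):
            selected.append(name)
    return selected


# Two stand-in plugins built for illustration only.
pdf_plugin = SimpleNamespace(
    can_read_local=lambda inputfile: inputfile.endswith(".pdf"),
    can_write_local=lambda inputfile: inputfile.endswith(".pdf"),
)
image_plugin = SimpleNamespace(
    can_read_local=lambda inputfile: inputfile.endswith((".jpg", ".png")),
)
plugins = {"wsm_pdf_example": pdf_plugin, "wsm_image_example": image_plugin}

readers = plugins_for_action(plugins, "can_read_local", "report.pdf")
remote_readers = plugins_for_action(
    plugins, "can_read_remote", "http://example.org/report.pdf")
```

Neither stand-in plugin defines can_read_remote, so no plugin is selected for the remote file.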
If your plugin returned True for a given action, the corresponding function read_metadata_local(..), read_metadata_remote(..) or write_metadata_local(..) is then called. You must therefore implement the corresponding function (e.g. if you return True for some file with can_write_local(..), then you must implement write_metadata_local(..)).
Your plugin code should also define the __plugin_version__ variable, which declares the interface version your plugin is compatible with. For example, set __plugin_version__ = "WebSubmit File Metadata Plugin API 1.0"
def can_read_local(inputfile):
    Returns True if file can be processed by this plugin.

    Parameters:
      * inputfile (string) - path to a file to read metadata from
    Returns: boolean - True if file can be processed

def can_read_remote(inputfile):
    Returns True if file at remote location can be processed by this plugin.

    Parameters:
      * inputfile (string) - URL to a file to read metadata from
    Returns: boolean - True if file can be processed

def can_write_local(inputfile):
    Returns True if file can be processed by this plugin for writing.

    Parameters:
      * inputfile (string) - path to a file to update metadata
    Returns: boolean - True if file can be processed

def read_metadata_local(inputfile, verbose):
    Returns a dictionary of metadata read from inputfile.

    Parameters:
      * inputfile (string) - path to file to read from
      * verbose (int) - verbosity
    Returns: dict - dictionary with metadata

def read_metadata_remote(inputfile, verbose):
    Returns a dictionary of metadata read from remote inputfile.

    Parameters:
      * inputfile (string) - URL to file to read from
      * verbose (int) - verbosity
    Returns: dict - dictionary with metadata

def write_metadata_local(inputfile, verbose):
    Update metadata of given inputfile.

    Parameters:
      * inputfile (string) - path to file to update
      * verbose (int) - verbosity
    Returns: dict - dictionary with metadata
If your plugin depends on an external library, it should check at load time (that is, in the main scope of the plugin) that this library is installed. If the library is missing, the plugin should raise an ImportError exception. For example:
    """
    WebSubmit Metadata Plugin - My custom plugin

    Dependencies: extractor
    """

    __plugin_version__ = "WebSubmit File Metadata Plugin API 1.0"

    import extractor

    def can_read_local(inputfile):
        [...]

The import extractor statement will raise such an ImportError exception if extractor is missing.
If your plugin can read the same file type as other installed
plugins, the system will combine the information returned by all
compatible plugins in a single dictionary, so that there is no
conflict.
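The combination step can be sketched as a plain dictionary merge. The text above does not say what happens when two plugins report different values for the same tag, so the rule used here (the later dictionary wins) is purely an assumption for illustration, and the function name combine_metadata is hypothetical.

```python
def combine_metadata(results):
    """Merge metadata dictionaries into one.

    Assumption for illustration: when two plugins report the same tag,
    the value from the later dictionary replaces the earlier one.
    """
    combined = {}
    for metadata in results:
        combined.update(metadata)
    return combined


# Made-up results standing in for what two compatible plugins might return.
from_first_plugin = {"Title": "Annual report", "Pages": 12}
from_second_plugin = {"Author": "J. Doe", "Pages": 12}
combined = combine_metadata([from_first_plugin, from_second_plugin])
```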
The behaviour is different when writing to a file: in that case the first compatible plugin found is used to update the metadata of the file. There is no way for the developer to prioritize plugins. Only the user can select a given plugin, by specifying the --force option.
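The difference between automatic and forced selection for writing can be sketched as follows. This is not the library's code: the function select_writer, the plugin names and the stand-in plugin objects are hypothetical, and the sketch assumes that a forced name bypasses the can_write_local(..) check, in line with the description of force as a way to skip plugin auto-discovery.

```python
from types import SimpleNamespace


def select_writer(plugins, inputfile, force=None):
    """Return the name of the plugin to use for writing, or None.

    plugins is a list of (name, plugin) pairs in discovery order.
    """
    if force is not None:
        # Forced selection: look the plugin up by name, no auto-discovery.
        for name, _plugin in plugins:
            if name == force:
                return name
        return None
    # Automatic selection: the first plugin that accepts the file wins.
    for name, plugin in plugins:
        can_write = getattr(plugin, "can_write_local", None)
        if can_write is not None and can_write(inputfile):
            return name
    return None


# Two stand-in plugins that both claim to write PDF files.
plugins = [
    ("wsm_first_example", SimpleNamespace(
        can_write_local=lambda inputfile: inputfile.endswith(".pdf"))),
    ("wsm_second_example", SimpleNamespace(
        can_write_local=lambda inputfile: inputfile.endswith(".pdf"))),
]

automatic = select_writer(plugins, "report.pdf")
forced = select_writer(plugins, "report.pdf", force="wsm_second_example")
```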