Welcome to Pacifica Policy’s documentation!¶
The Pacifica Policy service provides endpoints that define policy questions for institutions. It is separate from the other services because certain operations required by the Pacifica Core services are more policy based.
Practically speaking, when the question a Pacifica service wants to ask the Metadata service is sufficiently complex, it should really be a Policy question. For example, when uploading data the Ingest service needs to validate the metadata being added. This new metadata must satisfy institutional requirements, so there is a Policy endpoint (several, actually) that helps ensure those requirements are met.
Installation¶
The Pacifica software is available through PyPI, so the instructions below create a virtual environment and install into it. Please keep in mind compatibility with the Pacifica Core services.
Installation in Virtual Environment¶
These installation instructions are intended to work on Windows, Linux, and Mac platforms. Please keep that in mind when following the instructions.
Please install the appropriate tested version of Python for maximum chance of success.
Linux and Mac Installation¶
mkdir ~/.virtualenvs
python -m virtualenv ~/.virtualenvs/pacifica
. ~/.virtualenvs/pacifica/bin/activate
pip install pacifica-policy
Windows Installation¶
This is done using PowerShell. Please do not use the Command Prompt (cmd.exe).
mkdir "$Env:LOCALAPPDATA\virtualenvs"
python.exe -m virtualenv "$Env:LOCALAPPDATA\virtualenvs\pacifica"
& "$Env:LOCALAPPDATA\virtualenvs\pacifica\Scripts\activate.ps1"
pip install pacifica-policy
Configuration¶
The Pacifica Core services require two configuration files. The REST API uses CherryPy, and reviewing the CherryPy configuration documentation is recommended. The service configuration file is an INI-formatted file containing configuration for database connections.
CherryPy Configuration File¶
An example of Policy server CherryPy configuration:
[global]
log.screen: True
log.access_file: 'access.log'
log.error_file: 'error.log'
server.socket_host: '0.0.0.0'
server.socket_port: 8181
[/]
request.dispatch: cherrypy.dispatch.MethodDispatcher()
tools.response_headers.on: True
tools.response_headers.headers: [('Content-Type', 'application/json')]
Service Configuration File¶
The service configuration is an INI file and an example is as follows:
[policy]
; This section has policy service specific config options
; The following strings reference formatting directives {}. The
; object passed to the format method is the transaction object
; from the metadata API. The DOI is special and added into the
; transaction object for that format as well.
; Internal URL format for transactions that are not released and have no DOI
internal_url_format = https://internal.example.com/{_id}
; Release URL format for transactions released but no DOI
release_url_format = https://release.example.com/{_id}
; DOI URL format for transactions with a DOI
doi_url_format = https://dx.doi.org/{doi}
; In memory object cache size (used in data release)
cache_size = 10000
; This sets the admin group name
admin_group = admin
; This sets the admin group id (should match group name in metadata)
admin_group_id = 0
; This sets the admin user id (should match user name in metadata)
admin_user_id = 0
[metadata]
; This section contains configuration for metadata service
; The global metadata url
endpoint_url = http://localhost:8121
; The endpoint to check for status of metadata service
status_url = http://localhost:8121/groups
[elasticsearch]
; This section describes configuration to contact elasticsearch
; URL to the elasticsearch server
url = http://127.0.0.1:9200
; Name of the elasticsearch index
index = pacifica_search
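As a minimal sketch of how the {} format directives in the [policy] section behave, the following Python reads a trimmed copy of the example configuration with the standard library configparser module and fills in each URL from a transaction object. The pick_url helper, the released flag, and the sample DOI value are hypothetical stand-ins for illustration; the Policy service's own configuration loader and selection logic may differ.

```python
import configparser

# Trimmed copy of the [policy] section from the example above.
EXAMPLE_CONFIG = """
[policy]
internal_url_format = https://internal.example.com/{_id}
release_url_format = https://release.example.com/{_id}
doi_url_format = https://dx.doi.org/{doi}
"""


def pick_url(config, transaction, released=False):
    """Hypothetical helper: choose and fill in a URL format for a transaction."""
    policy = config['policy']
    if transaction.get('doi'):
        template = policy['doi_url_format']
    elif released:
        template = policy['release_url_format']
    else:
        template = policy['internal_url_format']
    # The transaction object supplies the values for the {} directives.
    return template.format(**transaction)


config = configparser.ConfigParser()
config.read_string(EXAMPLE_CONFIG)

print(pick_url(config, {'_id': 1234}))
print(pick_url(config, {'_id': 1234}, released=True))
print(pick_url(config, {'_id': 1234, 'doi': '10.0000/example'}))
```

Because the directives are ordinary Python format fields, any key present in the transaction object (plus the added doi key) can be referenced in a URL format.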
Starting the Service¶
The Policy service can be started in two ways. It is also important to understand the deployment requirements and how they apply to REST services. Using the internal CherryPy server is recommended for Windows platforms; for Linux and Mac platforms, deploying the service with uWSGI is recommended.
Deployment Considerations¶
The Policy server can have the same memory consumption issues as the Metadata service. Please apply the deployment recommendations for the Metadata service to the Policy service as well.
CherryPy Server¶
To make it easier to run the Policy service with CherryPy's built-in server, there is a command line entry point.
$ pacifica-policy --help
usage: pacifica-policy [-h] [-c CONFIG] [-p PORT] [-a ADDRESS]
Run the policy server.
optional arguments:
-h, --help show this help message and exit
-c CONFIG, --config CONFIG
cherrypy config file
-p PORT, --port PORT port to listen on
-a ADDRESS, --address ADDRESS
address to listen on
$ pacifica-policy
[09/Jan/2019:09:17:26] ENGINE Listening for SIGTERM.
[09/Jan/2019:09:17:26] ENGINE Bus STARTING
[09/Jan/2019:09:17:26] ENGINE Set handler for console events.
[09/Jan/2019:09:17:26] ENGINE Started monitor thread 'Autoreloader'.
[09/Jan/2019:09:17:26] ENGINE Serving on http://0.0.0.0:8181
[09/Jan/2019:09:17:26] ENGINE Bus STARTED
uWSGI Server¶
To make it easier to run the Policy service with uWSGI, there is a module that can be included in the uWSGI configuration. uWSGI is very configurable and can use this module in many different ways. Please consult the uWSGI configuration documentation for more complicated deployments.
$ pip install uwsgi
$ uwsgi --http-socket :8181 --master --module pacifica.policy.wsgi
Example Usage¶
The Policy API is strictly a read-only interface. The command line tools update data in the Metadata API and are mostly meant to run as cron-job-like processes.
The API¶
The policy server is split up into endpoints named for the Pacifica project that uses them. So the path /uploader is used by the Pacifica Uploader (http://github.com/pacifica/pacifica-uploader) to control its behavior. The idea is that the workflow implemented by the various Pacifica projects has some element of site- or instance-specific policy that can be applied to the running service. The policy is driven by the metadata, and thus this project should talk to the Metadata service.
Events API¶
The Events API is used by the Notifications service. The role of this query is to verify that the event received by the Notifications service is allowed to be sent to the user named on the URL path.
Request Example:
POST /events/dmlb2001
Content-Type: application/json
{
"data": [
...
]
}
Good Response Example:
Http-Code: 200
{
"status": "success"
}
Failed Response Example:
Http-Code: 401
{
"error": "..."
}
The underlying logic for this implementation is the same as the ingest endpoint discussed next.
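A request like the Events example can be built with the Python standard library. This is only a sketch: build_events_request is a hypothetical helper, the base URL assumes the port from the example CherryPy configuration, and the event payload is left to the caller because the documentation elides it.

```python
import json
import urllib.request


def build_events_request(base_url, username, events):
    """Build (but do not send) a POST request for the Events endpoint."""
    body = json.dumps({'data': events}).encode('utf-8')
    return urllib.request.Request(
        '{}/events/{}'.format(base_url.rstrip('/'), username),
        data=body,
        headers={'Content-Type': 'application/json'},
        method='POST'
    )


# An empty event list keeps the sketch free of invented event content.
request = build_events_request('http://localhost:8181', 'dmlb2001', [])
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the URL and body easy to inspect before anything touches a running service.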
Ingest API¶
The Ingest API is used by the Ingest service. This endpoint verifies the relationships between user, project and instrument before allowing an upload. The content of the body document is defined by the uploader.
Request Example:
POST /ingest
Content-Type: application/json
[
...
]
Good Response Example:
Http-Code: 200
{
"status": "success"
}
Failed Response Example:
Http-Code: 401
{
"error": "..."
}
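The Events and Ingest endpoints share the same response shapes, so a client can treat them uniformly. The sketch below assumes only what the response examples show (HTTP 200 with a status of success, or an error document otherwise); interpret_policy_response is a hypothetical helper, not part of the Policy service.

```python
import json


def interpret_policy_response(http_code, body_text):
    """Return (allowed, message) for a response shaped like the examples."""
    try:
        body = json.loads(body_text) if body_text else {}
    except ValueError:
        body = {}
    if not isinstance(body, dict):
        body = {}
    if http_code == 200 and body.get('status') == 'success':
        return True, 'success'
    # Anything else is treated as a denial; surface the error text if present.
    fallback = 'unexpected response (HTTP {})'.format(http_code)
    return False, body.get('error', fallback)


print(interpret_policy_response(200, '{"status": "success"}'))
print(interpret_policy_response(401, '{"error": "..."}'))
```

Treating every non-success answer as a denial is a deliberately conservative choice for a policy check.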
Reporting and Status API¶
This document does not currently go into detail about these APIs. These endpoints are intended for tools that show users the status of current uploads, as well as for institutional reporting tools that aggregate metrics about uploads in Pacifica. Eventually, Pacifica should include a basic set of websites that use these endpoints, but it does not currently.
Uploader API¶
The Uploader API is a simple query interface to get complex metadata interactively while users are using the Uploader. This API takes a JSON document that looks very SQL-like, but the query language is not complete.
Request Example:
POST /uploader
Content-Type: application/json
{
"user": 100,
"from": "instruments",
"columns": [
"_id",
"name"
],
"where": {
"_id": 54
}
}
Good Response Example:
Http-Code: 200
[
{
"_id": 54,
"name": "NMR PROBES: Nittany Liquid"
}
]
Failed Response Example:
Http-Code: 500
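To illustrate the SQL-like shape of the Uploader query, the sketch below assembles the request document from the example and renders it as SQL-style text. Both helpers are hypothetical; the rendering is only a reading aid and is not a claim about how the service evaluates the query.

```python
import json


def build_uploader_query(user, table, columns, where):
    """Assemble an Uploader query document like the request example."""
    return {
        'user': user,
        'from': table,
        'columns': list(columns),
        'where': dict(where),
    }


def describe_query(query):
    """Render the query as SQL-style text, purely as a reading aid."""
    text = 'SELECT {} FROM {}'.format(', '.join(query['columns']), query['from'])
    if query['where']:
        conditions = ['{} = {!r}'.format(key, value)
                      for key, value in sorted(query['where'].items())]
        text += ' WHERE ' + ' AND '.join(conditions)
    return text


query = build_uploader_query(100, 'instruments', ['_id', 'name'], {'_id': 54})
print(json.dumps(query))      # the JSON body to POST to /uploader
print(describe_query(query))  # SELECT _id, name FROM instruments WHERE _id = 54
```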
Admin Command Line¶
There is a single admin command line tool (pacifica-policy-cmd) with two subcommands, data_release and searchsync. The data_release subcommand handles setting the data_release attributes of the Projects and Transactions. The searchsync subcommand handles formatting and synchronizing metadata to Elasticsearch.
$ pacifica-policy-cmd --help
usage: pacifica-policy-cmd [-h] [--verbose] {data_release,searchsync} ...
positional arguments:
{data_release,searchsync}
sub-command help
data_release data_release help
searchsync searchsync help
optional arguments:
-h, --help show this help message and exit
--verbose enable verbose debug output
Data Release¶
The data release process involves two phases: updating the suspense date and setting data release. The suspense date is the future date on which the metadata and data associated with an object in metadata will be released. The data release phase compares the suspense date with the current time to determine whether the object should be released.
$ pacifica-policy-cmd data_release --help
usage: pacifica-policy-cmd data_release [-h]
[--exclude [EXCLUDE [EXCLUDE ...]]]
[--keyword KEYWORD]
[--time-after TIME_AFTER]
[--time-ago TIME_AGO]
data release by policy
optional arguments:
-h, --help show this help message and exit
--exclude [EXCLUDE [EXCLUDE ...]]
id of keyword prefix to exclude.
--keyword KEYWORD keyword one of projects.actual_end_date,
projects.actual_start_date, projects.submitted_date,
projects.accepted_date, projects.closed_date,
transactions.created, transactions.updated.
--time-after TIME_AFTER
set suspense date on data to X days after keyword.
--time-ago TIME_AGO only objects updated after X days ago.
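The two phases can be sketched with plain date arithmetic. This is an illustration under stated assumptions rather than the tool's implementation: the function names are hypothetical, the sample dates are made up, and whether the real comparison is inclusive is not stated in this document.

```python
from datetime import datetime, timedelta


def suspense_date(keyword_date, time_after_days):
    """Phase one: the suspense date is X days after the keyword date."""
    return keyword_date + timedelta(days=time_after_days)


def should_release(suspense, now):
    """Phase two: release once the suspense date is no longer in the future."""
    return suspense <= now


# Hypothetical project closed on 9 Jan 2019, released 365 days later.
closed = datetime(2019, 1, 9)
suspense = suspense_date(closed, 365)
print(suspense.date())
print(should_release(suspense, datetime(2019, 6, 1)))
print(should_release(suspense, datetime(2020, 6, 1)))
```

Here keyword_date stands for the value of the field chosen with --keyword (for example projects.closed_date) and time_after_days for the value given to --time-after.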
Search Sync¶
The search synchronization to Elasticsearch is driven by the Policy service. The metadata in Elasticsearch is meant to be consumed by client applications, and in order to be performant those clients should communicate directly with Elasticsearch. This does mean that the metadata in Elasticsearch may not be as current as the metadata in the Metadata API.
$ pacifica-policy-cmd searchsync --help
usage: pacifica-policy-cmd searchsync [-h] [--objects-per-page ITEMS_PER_PAGE]
[--threads THREADS]
[--time-ago TIME_AGO]
sync sql data to elastic for search
optional arguments:
-h, --help show this help message and exit
--objects-per-page ITEMS_PER_PAGE
objects per bulk upload.
--threads THREADS number of threads to sync data
--time-ago TIME_AGO only objects newer than X days ago.
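The --objects-per-page and --threads options suggest a paged producer and a pool of worker threads. The following is a generic sketch of that pattern using the standard library queue and threading modules. All names here are hypothetical, and the bulk upload to Elasticsearch is replaced by appending to a list, so this should not be read as the actual searchsync implementation.

```python
import queue
import threading


def generate_pages(object_ids, items_per_page, work_queue):
    """Split the ids into pages and put each page on the work queue."""
    for start in range(0, len(object_ids), items_per_page):
        work_queue.put(object_ids[start:start + items_per_page])


def worker(work_queue, uploaded, lock):
    """Take pages off the queue until a None sentinel arrives."""
    while True:
        page = work_queue.get()
        if page is None:
            break
        # A real worker would bulk upload the page to Elasticsearch here.
        with lock:
            uploaded.append(page)


def sync(object_ids, items_per_page=2, threads=2):
    """Run the producer and the workers, returning the pages handled."""
    work_queue = queue.Queue()
    uploaded, lock = [], threading.Lock()
    pool = [threading.Thread(target=worker, args=(work_queue, uploaded, lock))
            for _ in range(threads)]
    for thread in pool:
        thread.start()
    generate_pages(object_ids, items_per_page, work_queue)
    for _ in pool:
        work_queue.put(None)  # one sentinel per worker
    for thread in pool:
        thread.join()
    return sorted(uploaded)  # sorted because thread order is not deterministic


print(sync([1, 2, 3, 4, 5]))
```

Larger pages mean fewer bulk requests, while more threads allow several pages to be in flight at once.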
Policy Python Module¶
Events Python Module¶
Events module to drive policy for who can see events.
Events rest module for the cherrypy endpoint.
Ingest Python Module¶
Ingest validation module.
The CherryPy rest object for the structure.
Below is an example post body:
[
{"destinationTable": "Transactions._id", "value": 1234},
{"destinationTable": "Transactions.submitter", "value": 34002},
{"destinationTable": "Transactions.project", "value": "34002"},
{"destinationTable": "Transactions.instrument", "value": 34002},
{"destinationTable": "TransactionKeyValue", "key": "Tag", "value": "Blah"},
{"destinationTable": "TransactionKeyValue", "key": "Taggy", "value": "Blah"},
{"destinationTable": "TransactionKeyValue", "key": "Taggier", "value": "Blah"},
{
"destinationTable": "Files",
"_id": 34, "name": "foo.txt", "subdir": "a/b/",
"ctime": "Tue Nov 29 14:09:05 PST 2016",
"mtime": "Tue Nov 29 14:09:05 PST 2016",
"size": 128, "mimetype": "text/plain"
},
{
"destinationTable": "Files",
"_id": 35, "name": "bar.txt", "subdir": "a/b/",
"ctime": "Tue Nov 29 14:09:05 PST 2016",
"mtime": "Tue Nov 29 14:09:05 PST 2016",
"size": 47, "mimetype": "text/plain"
}
]
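A body of this shape can also be assembled in Python and serialized with the json module, which avoids hand-written separator mistakes. The build_ingest_body helper is hypothetical; it only mirrors the destinationTable entries shown in the example.

```python
import json


def build_ingest_body(transaction_id, submitter, project, instrument,
                      key_values, files):
    """Assemble a metadata list shaped like the example post body."""
    body = [
        {'destinationTable': 'Transactions._id', 'value': transaction_id},
        {'destinationTable': 'Transactions.submitter', 'value': submitter},
        {'destinationTable': 'Transactions.project', 'value': project},
        {'destinationTable': 'Transactions.instrument', 'value': instrument},
    ]
    for key, value in key_values:
        body.append({'destinationTable': 'TransactionKeyValue',
                     'key': key, 'value': value})
    for file_meta in files:
        entry = {'destinationTable': 'Files'}
        entry.update(file_meta)
        body.append(entry)
    return body


body = build_ingest_body(
    1234, 34002, '34002', 34002,
    key_values=[('Tag', 'Blah')],
    files=[{
        '_id': 34, 'name': 'foo.txt', 'subdir': 'a/b/',
        'ctime': 'Tue Nov 29 14:09:05 PST 2016',
        'mtime': 'Tue Nov 29 14:09:05 PST 2016',
        'size': 128, 'mimetype': 'text/plain',
    }],
)
# json.dumps always emits valid JSON for this structure.
print(json.dumps(body, indent=2))
```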
Reporting Python Module¶
CherryPy Uploader Policy object class.
CherryPy Uploader Policy object class.
CherryPy Status Metadata projectinfo base class.
class pacifica.policy.reporting.transaction.query_base.QueryBase
    Formats summary data for other classes down the tree.
    static _merge_two_dicts(dict_a, dict_b)
        Given two dicts, merge them into a new dict as a shallow copy.
    base_user_info = {'emsl_employee': False, 'instrument_list': [], 'project_list': []}
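A shallow merge of two dicts, as described for _merge_two_dicts, can be sketched as follows. This is an assumption about the behavior rather than the module's source; in particular, letting the second dict win on duplicate keys is a guess, and the sample values are made up.

```python
def merge_two_dicts(dict_a, dict_b):
    """Return a new dict with dict_a's items, overridden by dict_b's."""
    merged = dict_a.copy()  # shallow copy: nested values are shared
    merged.update(dict_b)
    return merged


base_user_info = {'emsl_employee': False, 'instrument_list': [], 'project_list': []}
merged = merge_two_dicts(base_user_info, {'emsl_employee': True})
print(merged)
# The inputs are left untouched, but nested lists are the same objects.
print(base_user_info['emsl_employee'])
print(merged['instrument_list'] is base_user_info['instrument_list'])
```

Because the copy is shallow, appending to a list in the merged dict also changes the list held by the original.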
CherryPy Status Metadata object class.
class pacifica.policy.reporting.transaction.transaction_details.TransactionDetails
    Retrieves a list of all transactions matching the search criteria.
    exposed = True
CherryPy Status Metadata object class.
class pacifica.policy.reporting.transaction.transaction_summary.TransactionSummary
    Retrieves a summary of all transactions matching the search criteria.
    static POST(time_basis=None, object_type=None, start_date=None, end_date=None, **kwargs)
        CherryPy POST method.
    static _get_transaction_list_summary(time_basis, object_list, object_type, start_date, end_date, user_id)
    exposed = True
The CherryPy rest object for the structure.
Status Python Module¶
CherryPy Uploader Policy object class.
Base class module for standard queries for the upload status tool.
class pacifica.policy.status.base.QueryBase
    This pulls the common bits of instrument and project query into a single class.
    all_instruments_url = 'http://localhost:8121/instruments'
    all_projects_url = 'http://localhost:8121/projects'
    all_transactions_url = 'http://localhost:8121/transactions'
    md_url = 'http://localhost:8121'
CherryPy Status Policy object class.
class pacifica.policy.status.instrument_query.InstrumentQuery
    CherryPy root object class.
    exposed = False
CherryPy Status Policy object class.
class pacifica.policy.status.project_query.ProjectQuery
    CherryPy root object class.
    exposed = False
The CherryPy rest object for the structure.
class pacifica.policy.status.rest.StatusPolicy
    CherryPy root object class.
    Not exposed by default; the base objects are exposed.
    exposed = False
CherryPy Status Policy object class.
class pacifica.policy.status.transaction_query.TransactionQuery
    CherryPy root object class.
    exposed = False
CherryPy Status Policy object class.
class pacifica.policy.status.user_query.UserQuery
    CherryPy root object class.
    exposed = True
CherryPy Project Policy object classes.
CherryPy Status Policy object class.
class pacifica.policy.status.instrument.by_project_id.InstrumentsByProject
    Retrieves instrument list for a given project.
    static _get_instruments_for_project(project_id)
        Return a list with all the instruments belonging to this project.
    exposed = True
CherryPy Status Policy object class.
class pacifica.policy.status.instrument.search.InstrumentKeywordSearch
    Retrieves a set of instruments for a given keyword set.
    _clean_up_instrument_list(inst_response, user_id)
        Clear out entries that do not belong to this user.
    _get_instruments_for_keywords(user_id, search_terms='')
        Return a list with all the instruments having this term.
    static _squash_output_list(inst_for_user_list, full_inst_list)
        Filter entries in the full instrument list.
    exposed = True
CherryPy Project Policy object classes.
CherryPy Status Policy object class.
class pacifica.policy.status.project.by_user.ProjectUserSearch
    Retrieves project list for a given user.
    static _get_projects_for_user(user_id=None)
        Return a list with all the projects involving this user.
    exposed = True
CherryPy Status Policy object class.
class pacifica.policy.status.project.lookup.ProjectLookup
    Retrieves details of a given project.
    exposed = True
CherryPy Status Policy object class.
class pacifica.policy.status.project.search.ProjectKeywordSearch
    Retrieves a set of projects for a given keyword set.
    _get_projects_for_keywords(user_id, search_terms=None)
        Return a list with all the projects involving this user.
    exposed = True
CherryPy Uploader Policy object class.
CherryPy Status Policy object class.
class pacifica.policy.status.transaction.files.FileLookup
    Retrieves files for a given transaction_id.
    static _get_file_list(transaction_id=None)
        Return files for the specified transaction entry.
    exposed = True
CherryPy Status Policy object class.
class pacifica.policy.status.transaction.lookup.TransactionLookup
    Retrieves details of a given transaction.
    static _get_transaction_details(transaction_id=None)
        Return details for the specified transaction entry.
    exposed = True
CherryPy Status Policy object class.
class pacifica.policy.status.transaction.search.TransactionSearch
    Retrieves a set of transactions for a given keyword set.
    static _get_transactions_for_keywords(kwargs, option=None)
        Return a list with all the projects involving this user.
    exposed = True
CherryPy Uploader Policy object class.
CherryPy Status Policy object class.
class pacifica.policy.status.user.lookup.UserLookup
    Retrieves info for the specified user.
    exposed = True
CherryPy Status Policy object class.
Uploader Python Module¶
CherryPy Uploader Policy object class.
The CherryPy rest object for the structure.
Admin Python Module¶
The Admin module has logic about checking for admin group info.
class pacifica.policy.admin.AdminPolicy
    Enforces the admin policy.
    Base class for checking for admin group membership or not.
    all_instruments_url = 'http://localhost:8121/instruments'
    all_projects_url = 'http://localhost:8121/projects'
    all_relationships_url = 'http://localhost:8121/relationships'
    all_users_url = 'http://localhost:8121/users'
    inst_group_url = 'http://localhost:8121/instrument_group'
    inst_user_url = 'http://localhost:8121/instrument_user'
    md_url = 'http://localhost:8121'
    proj_instrument_url = 'http://localhost:8121/project_instrument'
    proj_user_url = 'http://localhost:8121/project_user'
Admin Command Python Module¶
Config Python Module¶
Configuration reading and validation module.
Data Release Python Module¶
Globals Python Module¶
Global static variables.
Root Rest Python Module¶
CherryPy root object class.
Search Render Python Module¶
This is the render object for the search interface.
Search Sync Python Module¶
Sync the database to elasticsearch index for use by Searching tools.
pacifica.policy.search_sync.create_worker_threads(threads, work_queue)
    Create the worker threads and return the list.
pacifica.policy.search_sync.generate_work(items_per_page, work_queue, time_ago)
    Generate the work from the db and send it to the work queue.
Validation Python Module¶
Validation methods for various objects.
pacifica.policy.validation._get_check_id(index, *args, **kwargs)
    Return the check ID in args or kwargs.
WSGI Python Module¶
This is the main policy server script.
This is the policy module.