Memex Explorer
Memex Explorer is a web application that registers microservices. Each microservice provides a specific functionality—at present, these include:
- Crawl Space: A microservice to create, run, and analyze Nutch and ACHE crawls. The crawl operation is heavily abstracted and simplified. Users provide a list of seed URLs to start the crawl, and in the case of ACHE’s targeted crawling, a machine learning model to declare the relevancy of crawled pages.
- Image Space: A microservice for displaying and searching images and their associated metadata. As a basic example of microservice interoperation, users will be able to dump images from Nutch crawls into an Image Space for analysis. Users can also search for images that match on the registered camera serial number, as well as upload their own images into an Image Space for comparison.
Home Page
The landing page lists currently registered projects. All microservice features are connected to, and accessed through, a given project. To create a project, click on the “New Project” button.
Project Page
The project page lists actions provided by registered microservices. Click on the gear icon to change the project name or description.
In the current Memex Explorer, only the Crawl Space microservice has been registered. Therefore, each project page will list the current crawls and crawl models.
Registering a Crawl Model
ACHE crawls require a Crawl Model to power the page classifier. The model consists of two elements: a “model” file and a “features” file. These can be generated by following the instructions on the ACHE page.
To register a new crawl model, click on the ‘+’ icon in the Crawl Models header. This will take you to the crawl model registration page.
Registering a Crawl
To register a new crawl, click the ‘+’ icon in the Crawls header. This will take you to the crawl registration page.
For Nutch crawls, you will need to provide a name, a description, and a seeds list text file containing newline-delimited URLs.
For ACHE crawls, you will need to provide the same inputs as above, and further select a Crawl Model.
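The seeds list is a plain text file with one URL per line. As a minimal sketch of how such a file could be prepared before upload (assuming Python 3; the function names, the `seeds.txt` file name, and the rule that only http and https URLs are kept are illustrative assumptions, not part of Memex Explorer):

```python
from urllib.parse import urlparse


def clean_seeds(lines):
    """Return the entries that look like http(s) URLs, stripped of whitespace.

    Assumption for this sketch: a usable seed has an http or https scheme
    and a host. Memex Explorer itself may accept other forms.
    """
    seeds = []
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        parsed = urlparse(url)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            seeds.append(url)
    return seeds


def write_seeds_file(path, urls):
    """Write one URL per line, i.e. a newline-delimited seeds list."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(urls) + "\n")


if __name__ == "__main__":
    # Hypothetical input; replace with your own candidate URLs.
    candidates = ["http://example.com", "", "not a url", "https://example.org/page"]
    write_seeds_file("seeds.txt", clean_seeds(candidates))
```

The resulting file can then be selected as the seeds list on the crawl registration page.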
memex-explorer
Memex Explorer is an application written in Django 1.7.
- To set up the environment, do the following:
$ wget http://bit.ly/miniconda
$ bash miniconda
Then, navigate to the repository root and run these commands:
$ conda env update
$ source activate memex
- Before you create the database, copy the settings file you want to use to memex/settings.py. There are two settings files, one for development and one for deployment. Do the following:
$ cd source
$ cp memex/settings_files/dev_settings.py memex/settings.py
(or)
$ cp memex/settings_files/deploy_settings.py memex/settings.py
- To set up the application, after creating the memex environment, run this command in the source folder. This will create the database for the application using the migration scripts provided in the source code:
$ python manage.py migrate
- Then, in the same folder, run this command to launch the application as a local server:
$ python manage.py runserver
- To set up superuser access to the administrative panel, run this command and provide a username, email, and password:
$ python manage.py createsuperuser
- To access the administration panel, navigate to http://localhost:8000/admin after running python manage.py runserver. Here you will be able to view and make manual changes to the database.
- To run the tests, run the command:
$ py.test
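The settings-file step in the list above could also be scripted, for example as part of a setup helper. The following is only a sketch under stated assumptions: the function name `install_settings` and the `"dev"`/`"deploy"` mode labels are hypothetical, while the file paths are the ones used in the `cp` commands above.

```python
import shutil
from pathlib import Path

# File names come from the cp commands above; the dictionary keys are
# hypothetical labels introduced for this sketch.
SETTINGS_TEMPLATES = {
    "dev": "dev_settings.py",
    "deploy": "deploy_settings.py",
}


def install_settings(source_dir, mode="dev"):
    """Copy memex/settings_files/<template> to memex/settings.py under source_dir."""
    if mode not in SETTINGS_TEMPLATES:
        raise ValueError("mode must be one of: " + ", ".join(sorted(SETTINGS_TEMPLATES)))
    package = Path(source_dir) / "memex"
    template = package / "settings_files" / SETTINGS_TEMPLATES[mode]
    target = package / "settings.py"
    # Like cp, this overwrites an existing settings.py.
    shutil.copyfile(str(template), str(target))
    return target
```

Calling `install_settings(".", "dev")` from the source folder would have the same effect as the first `cp` command.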
Starting Celery
- Memex Explorer relies on both Redis and Celery to manage tasks. To start the Celery worker, run these two commands from the source directory, each in its own terminal, since redis-server runs in the foreground by default:
$ redis-server
$ celery -A memex worker -l info
Installing Compass
If you need to make changes to the .scss stylesheets, Compass is a useful tool. The following are instructions on how to install Compass without using sudo.
- For Mac users, add this line to your ~/.bash_profile:
export PATH=/Users/<username>/.gem/ruby/<ruby version>/bin:$PATH
Then run:
$ gem install compass --user-install
This will install Compass on your system.
- To make changes to the stylesheets, do:
$ cd ../
$ compass watch
NOTE: Memex Explorer is still under active development