README.txt

## AOL log analysis

Code to manipulate the AOL query logs.

### Code organization
The code is organized in three parts:

* Java code, which implements the logic;
* Java CLI: the package it.cnr.isti.hpc.ql.aol.cli, which implements commands that can be run from the terminal;
* scripts directory (/scripts): bash scripts that call the Java code (if you are not developing the code, the scripts are all you need).

### JSON logs

Each event in the AOL log is converted to JSON, following the schema defined in ''it.cnr.isti.hpc.ql.aol.Event''. An event
can be either a click or a query, and the schema is:

* String userId;
* String query;
* Date time;
* Long timeInMillis;
* Integer rank;
* String click;
* int frequency = 1 (the frequency of the query in the log);
* Type type.
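
As a rough illustration of the schema above, here is a minimal sketch of a class with the same fields. It is not the actual ''it.cnr.isti.hpc.ql.aol.Event'' class: the name EventSketch, the enum values QUERY and CLICK, the two factory methods, and the sample values are all assumptions made for this example.

```java
import java.util.Date;

public class EventSketch {
    // Hypothetical enum: the schema only says an event is a click or a query.
    enum Type { QUERY, CLICK }

    String userId;
    String query;
    Date time;
    Long timeInMillis;
    Integer rank;       // assumed to stay null for plain queries
    String click;       // assumed to stay null for plain queries
    int frequency = 1;  // the frequency of the query in the log
    Type type;

    // Hypothetical factory for a query event.
    static EventSketch query(String userId, String query, long timeInMillis) {
        EventSketch e = new EventSketch();
        e.userId = userId;
        e.query = query;
        e.time = new Date(timeInMillis);
        e.timeInMillis = timeInMillis;
        e.type = Type.QUERY;
        return e;
    }

    // Hypothetical factory for a click event: a query plus the clicked result.
    static EventSketch click(String userId, String query, long timeInMillis,
                             int rank, String url) {
        EventSketch e = query(userId, query, timeInMillis);
        e.rank = rank;
        e.click = url;
        e.type = Type.CLICK;
        return e;
    }

    public static void main(String[] args) {
        // Made-up sample values, not taken from the real logs.
        EventSketch q = query("user-1", "example query", 1000L);
        EventSketch c = click("user-1", "example query", 2000L, 1, "http://example.com");
        System.out.println(q.type + " " + q.query + " " + q.frequency);
        System.out.println(c.type + " " + c.rank + " " + c.click);
    }
}
```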

### Available scripts
1. If you modified the Java code, or after the first checkout, build the project before running the scripts:

mvn assembly:assembly -DskipTests

2. NOTE: scripts must be run from the project folder! You can set Java properties and other configuration details in scripts/config.sh.

* scripts/convert-aol-to-json.sh aol-logs.txt.gz aol-logs.json - converts the logs to the JSON format

### Todo

1. sessions: write a CLI that takes the query logs in JSON and produces a list of sessions (one record per session; each session is a list of events plus other metadata).
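
One possible starting point for the sessions item is sketched below. This is only an illustration, not existing project code: the class SessionSketch, the stand-in type Ev, and the 30-minute inactivity gap are all assumptions, since this README does not define where a session ends. The sketch also assumes the events are already sorted by user and then by time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SessionSketch {
    // Assumed inactivity gap: the README does not define a session boundary.
    static final long GAP_MILLIS = 30 * 60 * 1000L;

    /** Hypothetical stand-in for an event, with only the fields used here. */
    static final class Ev {
        final String userId;
        final long timeInMillis;

        Ev(String userId, long timeInMillis) {
            this.userId = userId;
            this.timeInMillis = timeInMillis;
        }
    }

    /**
     * Splits events into sessions. A new session starts when the user changes
     * or when the gap from the previous event exceeds GAP_MILLIS.
     * Assumes the input is sorted by userId, then by timeInMillis.
     */
    static List<List<Ev>> split(List<Ev> events) {
        List<List<Ev>> sessions = new ArrayList<>();
        List<Ev> current = null;
        Ev prev = null;
        for (Ev e : events) {
            boolean startNew = prev == null
                    || !prev.userId.equals(e.userId)
                    || e.timeInMillis - prev.timeInMillis > GAP_MILLIS;
            if (startNew) {
                current = new ArrayList<>();
                sessions.add(current);
            }
            current.add(e);
            prev = e;
        }
        return sessions;
    }

    public static void main(String[] args) {
        // Made-up events: two close together, one an hour later, one from another user.
        List<Ev> events = Arrays.asList(
                new Ev("u1", 0L),
                new Ev("u1", 60_000L),
                new Ev("u1", 3_600_000L),
                new Ev("u2", 3_600_500L));
        System.out.println(split(events).size()); // prints 3
    }
}
```

A real implementation would also need to carry the per-session metadata mentioned above, which this sketch leaves out.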