## AOL log analysis
Code to manipulate the AOL logs
### Code organization
The code is organized in three parts:
* Java code, which implements the logic;
* Java CLI: the package it.cnr.isti.hpc.ql.aol.cli, which implements commands that can be run from the terminal;
* scripts directory (/scripts): bash scripts that call the Java code (if you did not develop the code, you only need the scripts).
### JSON logs
Each event in the AOL log is converted to JSON, following the schema in ''it.cnr.isti.hpc.ql.aol.Event''. An event
is either a click or a query, and the schema is:
* String userId;
* String query;
* Date time;
* Long timeInMillis;
* Integer rank;
* String click;
* int frequency = 1 (the frequency of the query in the log)
* Type type;
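
As a rough illustration, the field list above could map onto a plain Java class like the one below. The names `EventSketch`, `QUERY`, `CLICK` and the two factory methods are hypothetical; the real class is it.cnr.isti.hpc.ql.aol.Event and may differ. The sketch also assumes that rank and click are set only for click events and that timeInMillis mirrors time.

```java
import java.util.Date;

// Hypothetical mirror of the schema above, NOT the project's Event class.
class EventSketch {
    // Assumed constant names: the README only says an event is a click or a query.
    enum Type { QUERY, CLICK }

    String userId;
    String query;
    Date time;
    Long timeInMillis;
    Integer rank;
    String click;
    int frequency = 1; // the frequency of the query in the log
    Type type;

    // Builds a query event; rank and click stay null (assumption).
    static EventSketch query(String userId, String query, Date time) {
        EventSketch e = new EventSketch();
        e.userId = userId;
        e.query = query;
        e.time = time;
        e.timeInMillis = time.getTime(); // assumed to mirror `time`
        e.type = Type.QUERY;
        return e;
    }

    // Builds a click event for the result at position `rank` with URL `click`.
    static EventSketch click(String userId, String query, Date time, int rank, String click) {
        EventSketch e = query(userId, query, time);
        e.rank = rank;
        e.click = click;
        e.type = Type.CLICK;
        return e;
    }
}
```

Keeping rank as an `Integer` rather than an `int` lets a query event leave it unset, which matches the boxed types in the schema.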
### Available scripts:
1. If you modified the Java code, or on the first checkout, build the project before running any script:
mvn assembly:assembly -DskipTests
2. NOTE: scripts must be run from the project folder! You can set Java properties and other configuration details in scripts/config.sh.
* scripts/convert-aol-to-json.sh aol-logs.txt.gz aol-logs.json - converts the logs to the JSON format
### Todo
1. sessions: write a CLI that takes the query logs in JSON and produces a list of sessions (one record per session; every session is a list of events plus other metadata)
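
One possible starting point for the session item is sketched below. This README does not define what a session is, so the rule used here is an assumption: a new session starts when the user changes or when the gap between consecutive events exceeds a threshold. `SessionSketch`, `Ev` and `split` are hypothetical names, and the input is assumed to be sorted by user and then by time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of session splitting; not part of the project.
class SessionSketch {
    // Minimal stand-in for an event: only the fields the split needs.
    static final class Ev {
        final String userId;
        final long timeInMillis;

        Ev(String userId, long timeInMillis) {
            this.userId = userId;
            this.timeInMillis = timeInMillis;
        }
    }

    // Splits events into sessions. Assumes events are sorted by user, then by time.
    // A new session starts on a user change or when the gap exceeds maxGapMillis.
    static List<List<Ev>> split(List<Ev> events, long maxGapMillis) {
        List<List<Ev>> sessions = new ArrayList<>();
        List<Ev> current = null;
        Ev previous = null;
        for (Ev e : events) {
            boolean newSession = previous == null
                    || !previous.userId.equals(e.userId)
                    || e.timeInMillis - previous.timeInMillis > maxGapMillis;
            if (newSession) {
                current = new ArrayList<>();
                sessions.add(current);
            }
            current.add(e);
            previous = e;
        }
        return sessions;
    }
}
```

The threshold is left as a parameter because the right value depends on the analysis; any per-session metadata would be added on top of the inner lists.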