These are scripts I wrote for crawling the multilingual, information-rich plenary sessions from the Europarl website (http://www.europarl.europa.eu/).
I call the resulting corpus RIE, which stands for Rich Information Europarl.
RIE was used in the following work:
Incorporating Side Information into Recurrent Neural Network Language Models
Cong Duy Vu Hoang, Reza Haffari and Trevor Cohn.
In Proceedings of NAACL-16 (short), 2016.