Justicio Web Scraping to SQL

Justicio is a question-answering assistant that answers user questions about Spain's official state gazette, the Boletín Oficial del Estado (BOE).

At the moment we are running a service that is free for users: Website and Repository

All BOE articles are embedded as vectors and stored in a vector database. When a user asks a question, the question is embedded in the same latent space and used to query the vector database for the most relevant passages. The retrieved passages are then sent to the LLM, which constructs the answer.
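
The retrieval step described above can be sketched roughly as follows. This is a minimal illustration, not Justicio's actual implementation: the `embed`, `cosine`, and `retrieve` functions are hypothetical, a bag-of-words count stands in for a real embedding model, a plain Python list stands in for the vector database, and the sample passages are made-up placeholders.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q_vec, embed(p)), reverse=True)
    return ranked[:k]


# Placeholder passages, not real BOE text.
passages = [
    "Royal decree on renewable energy subsidies",
    "Order regulating fishing quotas in the Atlantic",
    "Resolution on public employment exams",
]

question = "What subsidies exist for renewable energy?"
top = retrieve(question, passages, k=1)

# The retrieved passages would then be placed in the LLM prompt as context.
prompt = "Context:\n" + "\n".join(top) + "\n\nQuestion: " + question
```

In a real deployment the embedding and similarity search would be handled by the embedding model and the vector database; only the overall flow (embed, query, pass context to the LLM) is taken from the description above.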

Tech Stack

Jupyter Notebook, MySQL, Python, Pandas

Contributions

Web scraping of the municipal regulations of La Coruña and Oviedo, with the scraped content saved as an SQL dump for later use in the vector database.
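
As a rough illustration of the "save as an SQL dump" step, the sketch below writes a few records to a database and exports them as SQL statements. It rests on several assumptions: the table name `regulations`, its columns, and the function `build_sql_dump` are hypothetical, the rows are placeholders rather than scraped data, and Python's built-in `sqlite3` module is used as a stand-in so the example runs without a MySQL server (the project itself lists MySQL in its stack).

```python
import sqlite3


def build_sql_dump(records):
    """Load (municipality, title, body) rows into a table and return an SQL dump."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE regulations (municipality TEXT, title TEXT, body TEXT)"
    )
    # Parameterized inserts avoid quoting problems in scraped text.
    conn.executemany("INSERT INTO regulations VALUES (?, ?, ?)", records)
    conn.commit()
    # iterdump() yields the SQL statements needed to recreate the database.
    dump = "\n".join(conn.iterdump())
    conn.close()
    return dump


# Placeholder rows standing in for scraped regulations.
records = [
    ("La Coruña", "Placeholder regulation title", "Placeholder regulation text"),
    ("Oviedo", "Placeholder regulation title", "Placeholder regulation text"),
]

dump = build_sql_dump(records)
```

The resulting string can be written to a `.sql` file. Note that a dump produced by SQLite may need adjustments before it can be imported into MySQL, since the two dialects differ.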