Stars
Classifier and Feature Extraction scripts used for the CookieBlock extension.
Repository for the CookieBlock browser extension, which automatically enforces user privacy policy on browser cookies.
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
Open source annotation tool for machine learning practitioners.
⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
Toolchain to retrieve and parse privacy policies from websites as described in our paper "Unifying Privacy Policy Detection" by Henry Hosseini, Martin Degeling, Christine Utz, and Thomas Hupperich.…
Classification of Tilt Document Entries in Privacy Policies.
Privacy Bot gathers, persists and analyzes privacy policies. #Mozilla Global Sprint Project
Fast computation of Krippendorff's alpha agreement measure in Python.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
modular-ml / wrapyfi-examples_llama
Forked from meta-llama/llama
Inference code for facebook LLaMA models with Wrapyfi support
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
LambdaLabsML / llama
Forked from shawwn/llama
Inference code for LLaMA models
A PyTorch native platform for training generative AI models
A script that recuts the original llama3_70B_instruct model into 2 or 4 shards, so that the model can run efficiently in `2x80GB` or `4x40GB` GPU environments.
A simple and efficient llama3 local service deployment solution that supports real-time streaming responses and is optimized to avoid the common problem of garbled Chinese characters.
Artifacts of the paper "Arcanum: Detecting and Evaluating the Privacy Risks of Browser Extensions on Web Pages and Web Content" in USENIX Security Symposium 2024
A tokenizer and sentence splitter for German and English web and social media texts.
Results and data from the paper "We Value Your Privacy ... Now Take Some Cookies: Measuring the GDPR’s Impact on Web Privacy"
Data set of top third party web domains with rich metadata about them
This tool converts HTML representations of privacy policies to plaintext. Full details of the approach can be found in Appendix A of our PolicyLint paper in USENIX Security 2019.