Round 1A

The 'Connecting the Dots' Challenge invites participants to transform PDFs into intelligent, interactive experiences by extracting structured outlines and building a web application using Adobe's PDF Embed API. The challenge consists of two rounds: the first focuses on creating an outline extractor for PDFs, while the second involves developing a user-friendly web app. Participants must adhere to specific requirements, including Docker compatibility and performance constraints, while aiming for high accuracy in heading detection and efficient processing.

Uploaded by

koyyadaanusha05

Welcome to the “Connecting the Dots” Challenge

Rethink Reading. Rediscover Knowledge

What if every time you opened a PDF, it didn’t just sit there—it spoke to you,
connected ideas, and narrated meaning across your entire library?

That’s the future we’re building — and we want you to help shape it.

In the Connecting the Dots Challenge, your mission is to reimagine the humble
PDF as an intelligent, interactive experience—one that understands structure,
surfaces insights, and responds to you like a trusted research companion.

The Journey Ahead

• Round 1:

Kick things off by building the brains — extract structured outlines from
raw PDFs with blazing speed and pinpoint accuracy. Then, power it up
with on-device intelligence that understands sections and links related
ideas together.

• Round 2:

It’s showtime! Build a beautiful, intuitive reading webapp using Adobe’s
PDF Embed API. You will be using your Round 1 work to design a
futuristic webapp.

Why This Matters

In a world flooded with documents, what wins is not more content — it’s
context. You’re not just building tools — you’re building the future of how we
read, learn, and connect. No matter your background — ML hacker, UI builder,
or insight whisperer — this is your stage.

Are you in?

It’s time to read between the lines. Connect the dots. And build a PDF
experience that feels like magic. Let’s go.

Round 1A: Understand Your Document

Challenge Theme: Connecting the Dots Through Docs

Your Mission

You're handed a PDF — but instead of simply reading it, you're tasked with
making sense of it like a machine would. Your job is to extract a structured
outline of the document — essentially the Title, and headings like H1, H2, and
H3 — in a clean, hierarchical format.

This outline will be the foundation for the rest of your hackathon journey.

Why This Matters

PDFs are everywhere, but machines don’t naturally understand their
structure. By building an outline extractor, you’re enabling smarter document
experiences, like semantic search, recommendation systems, and insight
generation.

What You Need to Build

You must build a solution that:

• Accepts a PDF file (up to 50 pages)
• Extracts:
    o Title
    o Headings: H1, H2, H3 (with level and page number)
• Outputs a valid JSON file in the format below:

{
  "title": "Understanding AI",
  "outline": [
    { "level": "H1", "text": "Introduction", "page": 1 },
    { "level": "H2", "text": "What is AI?", "page": 2 },
    { "level": "H3", "text": "History of AI", "page": 3 }
  ]
}
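The output format above can be produced with a small helper. The following is a minimal sketch, not part of the official brief: the function names `make_outline` and `to_json` are hypothetical, and the validation rules (levels limited to H1 to H3, page as a non-negative integer) are assumptions inferred from the sample.

```python
import json

# Levels named in the brief; anything else is rejected in this sketch.
ALLOWED_LEVELS = {"H1", "H2", "H3"}


def make_outline(title, headings):
    """Build the output dict from a title and (level, text, page) tuples."""
    outline = []
    for level, text, page in headings:
        if level not in ALLOWED_LEVELS:
            raise ValueError(f"unsupported level: {level!r}")
        # Assumption: pages are integers; the brief does not say whether
        # numbering starts at 0 or 1, so only negatives are rejected here.
        if not isinstance(page, int) or page < 0:
            raise ValueError(f"page must be a non-negative integer, got {page!r}")
        outline.append({"level": level, "text": text.strip(), "page": page})
    return {"title": title.strip(), "outline": outline}


def to_json(document):
    """Serialize the outline; ensure_ascii=False keeps non-Latin text readable."""
    return json.dumps(document, ensure_ascii=False, indent=2)
```

Keeping `ensure_ascii=False` is a design choice aimed at the multilingual bonus: Japanese headings are written as-is rather than as escape sequences, and the file is still valid JSON.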

You Will Be Provided

1. A sample input PDF (e.g., sample.pdf)
2. A sample ground truth output (sample.json) for format clarity
3. Sample Dockerfile
4. Sample Solution

Docker Requirements

• Please ensure your Dockerfile is compatible with the AMD64
architecture. Since we will build and run the image on an AMD64
machine, your base image and any dependencies should support
linux/amd64. Optionally, you can include the following in your
Dockerfile to explicitly specify the platform:
FROM --platform=linux/amd64 <base_image>
• CPU architecture: amd64 (x86_64)
• No GPU dependencies
• Model size (if used) ≤ 200MB
• Should work offline — no network/internet calls

Expected Execution

We will build the docker image using the following command:

```
docker build --platform linux/amd64 -t mysolutionname:somerandomidentifier .
```

After building the image, we will run the solution using the run command
specified in the submitted instructions.

```
docker run --rm -v $(pwd)/input:/app/input -v $(pwd)/output:/app/output --network none mysolutionname:somerandomidentifier
```

Your container should:

• Automatically process all PDFs from the /app/input directory,
generating a corresponding filename.json in /app/output for each
filename.pdf
• output.json
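The container behaviour described above can be sketched as a small batch driver. This is an illustration under stated assumptions, not the provided sample solution: `process_directory` and `extract_outline` are hypothetical names, and the extractor here is a placeholder that only returns the file name as the title.

```python
import json
from pathlib import Path


def extract_outline(pdf_path):
    """Placeholder extractor (assumption): a real one would parse the PDF."""
    return {"title": pdf_path.stem, "outline": []}


def process_directory(input_dir, output_dir, extractor=extract_outline):
    """Write one filename.json to output_dir for each filename.pdf in input_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # sorted() gives a deterministic processing order across runs.
    for pdf_path in sorted(input_dir.glob("*.pdf")):
        result = extractor(pdf_path)
        out_path = output_dir / (pdf_path.stem + ".json")
        out_path.write_text(
            json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8"
        )
        written.append(out_path)
    return written
```

Inside the container, the entry point would call `process_directory("/app/input", "/app/output")`, matching the volume mounts in the run command. Passing the extractor as a parameter keeps the file handling separate from the PDF logic, which makes the driver easy to test without real PDFs.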

Constraints

Constraint        Requirement
Execution time    ≤ 10 seconds for a 50-page PDF
Model size        ≤ 200MB (if used)
Network           No internet access allowed
Runtime           Must run on CPU (amd64); your solution should run on
                  a system with 8 CPUs and 16 GB RAM
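To check the execution-time constraint locally, it may help to time the extractor on a 50-page PDF. The sketch below is one possible approach; the helper names `timed` and `within_budget` are hypothetical, and timings on a development machine will not match the evaluation hardware exactly.

```python
import time

# Execution-time limit for a 50-page PDF, taken from the constraints table.
TIME_BUDGET_SECONDS = 10.0


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def within_budget(elapsed, budget=TIME_BUDGET_SECONDS):
    """True if the measured time fits inside the budget."""
    return elapsed <= budget
```

Wrapping the extraction call as `timed(extract, pdf_path)` (with `extract` standing in for your own function) gives a quick signal while iterating, well before the image is built and run.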

Scoring Criteria

Criteria                                           Max Points
Heading Detection Accuracy (Precision + Recall)    25
Performance (Time & Size Compliance)               10
Bonus: Multilingual Handling (e.g., Japanese)      10
Total                                              45
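Since heading detection is scored on precision and recall, a local check against the sample ground truth can guide development. The sketch below assumes a heading counts as correct only when level, normalized text, and page all match; the brief does not state the official matching rule, so treat this as an approximation. The function names are hypothetical.

```python
def normalize(heading):
    """Reduce a heading dict to a comparable (level, text, page) key."""
    # Collapse whitespace and ignore case so trivial differences do not count.
    text = " ".join(heading["text"].split()).lower()
    return (heading["level"], text, heading["page"])


def precision_recall(predicted, expected):
    """Compute precision and recall of predicted headings against ground truth."""
    pred = {normalize(h) for h in predicted}
    gold = {normalize(h) for h in expected}
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

Precision is the share of predicted headings that are correct, and recall is the share of true headings that were found, so an extractor that emits too many candidates is penalized on the first and one that misses headings on the second.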

Submission Checklist

1. A Git project with a working Dockerfile in the root directory
2. A working Dockerfile
3. All dependencies installed within the container
4. A README.md that explains:
    o Your approach
    o Any models or libraries used
    o How to build and run your solution (this is purely for
      documentation purposes; your solution should run using the
      commands in the “Expected Execution” section above)

Pro Tips

• Don’t rely solely on font sizes for heading level determination;
headings in some PDFs break that assumption.
• Test your solution across both simple and complex PDFs.
• Make your code modular; you’ll reuse this structure in Round
1B.
• Important: Please keep your Git repo private until the competition
deadline. You will be informed when to make the repo public.
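The first tip can be illustrated with a classifier that combines several weak signals instead of font size alone. This is a rough sketch under stated assumptions: the `Line` record, the thresholds (1.6, 1.3, 1.1), and the numbering pattern are all hypothetical choices for illustration, and it assumes some PDF parser has already produced text, font size, and a bold flag per line.

```python
import re
from dataclasses import dataclass

# Matches section numbering such as "1 ", "2.1 " or "3.2.1 " at line start.
NUMBERING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+\S")


@dataclass
class Line:
    text: str
    font_size: float
    bold: bool


def heading_level(line, body_font_size):
    """Return 'H1', 'H2', 'H3' or None by combining several weak signals."""
    text = line.text.strip()
    # Long lines and lines ending in a full stop are treated as body text.
    if not text or len(text) > 120 or text.endswith("."):
        return None
    ratio = line.font_size / body_font_size
    match = NUMBERING.match(text)
    # Numbering decides the level only when the line also looks emphasized,
    # so ordinary numbered list items in body text are not promoted.
    if match and (line.bold or ratio >= 1.05):
        depth = match.group(1).count(".") + 1
        return f"H{depth}" if depth <= 3 else None
    if ratio >= 1.6:
        return "H1"
    if ratio >= 1.3:
        return "H2"
    if ratio >= 1.1 or line.bold:
        return "H3"
    return None
```

For example, a bold line "2.1 Background" at body size is classified as H2 from its numbering, even though its font size alone would not mark it as a heading. Real documents will need tuned thresholds and extra signals such as spacing or position on the page.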

What Not to Do

• Do not hardcode headings or file-specific logic
• Do not make API or web calls
• Do not exceed the runtime/model size constraints

[[Public Dataset Folder]]

(For Sample Input and Output Files, please refer to the appendix)

