Probabilistic Database

`main.py`

`main.py` controls the overall logic of the program. It also invokes the Gibbs sampling extension.

Files needed for Running `main.py`

query file
table files
Predicate.py
Variable.py
GibbsSampling.py
Lift.py

Dependencies

pandas
numpy
matplotlib
progressbar

These dependencies are listed in the requirements.txt file.

To install them, please run

pip install -r requirements.txt

Sample Command for running main.py

python main.py --table t2.txt --query query.txt --table t1.txt --table t3.txt --table t4.txt
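
The command above passes `--table` several times and `--query` once. The following is a minimal sketch of how such a command line could be parsed with `argparse`; it is not taken from `main.py`, and the helper names `build_parser` and `parse_cli` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for the flags used in the sample command (sketch only)."""
    parser = argparse.ArgumentParser(description="Probabilistic database (sketch)")
    # Exactly one query file is expected.
    parser.add_argument("--query", required=True, help="path to the query file")
    # action="append" collects every --table occurrence into one list,
    # in the order they appear on the command line.
    parser.add_argument("--table", action="append", default=[],
                        help="path to a table file; repeat once per table")
    return parser


def parse_cli(argv):
    """Return (query_path, list_of_table_paths) for the given argument list."""
    args = build_parser().parse_args(argv)
    return args.query, args.table
```

With this sketch, `parse_cli(["--table", "t2.txt", "--query", "query.txt", "--table", "t1.txt"])` yields the query path and the table paths in command-line order, so the position of `--query` among the `--table` flags would not matter.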

Generating Random Data for Testing

File: DataGenerator.py

How to specify the number of tuples you want:

n is not the exact number of tuples in the database. The number of tuples is on the order of $n^2$.

Change the value of n at line 51 of DataGenerator.py to adjust the size:

if __name__ == "__main__":
    n = 100 #<--------change this n value
    print("Generate data points of size: " + str(n))
    DataGenerator(n)

The generator produces P, Q and R tables that are similar to the inputs given in the examples: P(x), Q(x), R(x, y) and T(x, y).
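
To see why the tuple count grows like $n^2$ rather than $n$, here is a rough back-of-the-envelope sketch. It assumes that each variable ranges over n values, so a unary table holds at most n tuples and a binary table at most $n^2$. This is an assumption about the generator, not something read from DataGenerator.py, and the function name `max_tuples` is hypothetical.

```python
def max_tuples(n, arities=None):
    """Upper bound on the number of generated tuples (sketch, assumed model).

    arities maps each table name to its number of columns; the default
    mirrors the example tables P(x), Q(x), R(x, y) and T(x, y).
    """
    if arities is None:
        arities = {"P": 1, "Q": 1, "R": 2, "T": 2}
    # A table with k columns over a domain of size n has at most n**k tuples.
    return sum(n ** k for k in arities.values())
```

Under these assumptions, `max_tuples(100)` is 100 + 100 + 10000 + 10000 = 20200, so the binary tables dominate and the total is on the order of $n^2$.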

Tables Generated

test_table1.txt
test_table2.txt
test_table3.txt
test_table4.txt

Extension

The implementation of the Gibbs sampling extension is located in the GibbsSampling.py file, which is called from the main.py program. You can change the number of steps, num_step, used for each sampling process in main.py. A larger num_step brings the approximation closer to the true value, but takes longer to compute.
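
The trade-off controlled by num_step can be illustrated with a small, self-contained sketch. It is not the code in GibbsSampling.py: the dictionary table format, the function names `gibbs_estimate` and `exact_probability`, and the fixed query "exists x: P(x) and Q(x)" are all assumptions made for illustration. Because the tuples are treated as independent here, the conditional distribution that each Gibbs step resamples from is simply the tuple's own probability.

```python
import random


def gibbs_estimate(p_table, q_table, num_step, seed=0):
    """Estimate Pr[exists x: P(x) and Q(x)] by Gibbs sampling (sketch).

    p_table and q_table map a value x to the probability that the tuple
    P(x) or Q(x) is present. This input format is hypothetical.
    """
    rng = random.Random(seed)
    # One Boolean variable per tuple: is the tuple present in the sampled world?
    probs = {("P", x): pr for x, pr in p_table.items()}
    probs.update({("Q", x): pr for x, pr in q_table.items()})
    state = {var: rng.random() < pr for var, pr in probs.items()}

    hits = 0
    for _ in range(num_step):
        # One sweep: resample every variable given all the others.
        # With independent tuples the conditional equals the marginal.
        for var, pr in probs.items():
            state[var] = rng.random() < pr
        # Evaluate the query on the current possible world.
        if any(state[("P", x)] and state.get(("Q", x), False) for x in p_table):
            hits += 1
    return hits / num_step


def exact_probability(p_table, q_table):
    """Closed form for the same query, used only to check the estimate."""
    miss = 1.0
    for x, pr in p_table.items():
        miss *= 1.0 - pr * q_table.get(x, 0.0)
    return 1.0 - miss
```

For example, with `p = {1: 0.5, 2: 0.3}` and `q = {1: 0.4, 2: 0.6}` the exact value is 1 - (1 - 0.2)(1 - 0.18) = 0.344. Calling `gibbs_estimate(p, q, num_step)` with a small num_step gives a noisy estimate, and increasing num_step tightens it around the exact value at the cost of more sweeps.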

