Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
Updated Apr 9, 2025 - Python
Samples on how to use Azure SQL database with Azure OpenAI
Implementation of TextRank with the option of using pre-trained Word2Vec embeddings as the similarity metric
String similarity functions and string distances: Jaccard, Levenshtein, Hamming, Jaro-Winkler, Q-grams, N-grams, LCS (Longest Common Subsequence), cosine similarity...
Spark functions to run popular phonetic and string matching algorithms
Sentential Semantic Similarity measurement library using BERT Embeddings for spatial distance evaluation.
Developed a book recommendation system for Amazon customers using memory-based and model-based collaborative filtering, utilizing the descriptions of books consumed and user interests.
A Clojure library for querying large data-sets on similarity
Practical experiments on Machine Learning in Python. Processing of sentences and finding relevant ones, approximation of function with polynomials, function optimization
String distances in Rust
Recommending movies using tweets as a proxy
A collection of diverse recommendation system projects, spanning collaborative filtering, content-based methods, and hybrid approaches.
This repository contains a web application that integrates with a music recommendation system. The system uses a dataset of 3,415 thirty-second audio files and a Locality-Sensitive Hashing (LSH) implementation to determine rhythmic similarity. It was built as an assignment for the Fundamental of Big Data Analytics (DS2004) course.
A replication of the Google image search engine that finds similar images in a database using artificial intelligence. The project uses cosine distance to measure similarity between images.
Link prediction - Who are my friends?
Efficient Pairwise Cosine Similarity Computation
Demos to test modelling and classification algorithms for face recognition
A content based recommendation system
Given a directed social graph, predict missing links to recommend users.
Backend application for a JavaScript snippet search engine. Data.csv comes from the 30 seconds of code database, https://github.com/30-seconds/30-seconds-of-code/tree/master/snippets
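The projects listed under this topic all build on cosine similarity or cosine distance between feature vectors. As a minimal sketch of that shared measure (not taken from any of the repositories above; the function names `cosine_similarity`, `cosine_distance`, and `pairwise_cosine_distance` are hypothetical and chosen only for illustration), in plain Python:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Cosine distance, defined here as 1 - cosine similarity."""
    return 1.0 - cosine_similarity(a, b)


def pairwise_cosine_distance(vectors):
    """Naive O(n^2) distance matrix over a list of vectors."""
    n = len(vectors)
    return [
        [cosine_distance(vectors[i], vectors[j]) for j in range(n)]
        for i in range(n)
    ]
```

This naive version recomputes every norm for every pair; larger collections would usually normalise each vector once and use a matrix product instead.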