Lakeflow community connectors are built on top of the Spark Python Data Source API and Spark Declarative Pipelines (SDP). These connectors let users ingest data from a variety of source systems.
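As a minimal illustration of the Data Source API half of that stack, a custom source can be defined, registered, and read like this (the class and format names here are invented for illustration, not taken from any shipped connector):

```python
from pyspark.sql.datasource import DataSource, DataSourceReader

class ExampleDataSource(DataSource):
    """Hypothetical batch source returning a fixed set of rows."""

    @classmethod
    def name(cls):
        # Short name used with spark.read.format(...)
        return "example"

    def schema(self):
        return "id INT, value STRING"

    def reader(self, schema):
        return ExampleReader()

class ExampleReader(DataSourceReader):
    def read(self, partition):
        # Yield rows as tuples matching the declared schema.
        yield (1, "hello")
        yield (2, "world")

# With an active SparkSession `spark`:
# spark.dataSource.register(ExampleDataSource)
# spark.read.format("example").load().show()
```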
Each connector consists of two parts:
- Source-specific implementation
- Shared library and SDP definition
The sources/ directory contains the available source connectors; each provides a source-specific implementation of the interface defined in sources/interface/lakeflow_connect.py.
The libs/ and pipeline/ directories contain the source code shared across all connectors.
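As a purely hypothetical sketch of what a source-specific implementation might look like (the class and method names below are invented for illustration; consult sources/interface/lakeflow_connect.py for the real contract):

```python
# Hypothetical sketch only; the actual interface in
# sources/interface/lakeflow_connect.py may differ.
from sources.interface.lakeflow_connect import LakeflowConnect  # assumed class name

class MySourceConnect(LakeflowConnect):
    def list_tables(self):
        # Invented method: enumerate the tables this source exposes.
        return ["events"]

    def read_table(self, table_name, start_offset=None):
        # Invented signature: yield rows for `table_name` since `start_offset`.
        yield {"id": 1, "payload": "example"}
```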
Users can follow the instructions in prompts/vibe_coding_instruction.md to create new connectors.
The repository also includes generic, shared test suites that validate any connector's source implementation.
TODO:
- Add dev guidelines
- Add general instructions on how to use a community connector (e.g., update the spec and create an SDP for ingestion), sketched roughly below
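Pending those instructions, here is a rough sketch of the ingestion flow they would describe: register a connector's data source, then expose it through an SDP dataset. This assumes the Spark Declarative Pipelines preview API (`from pyspark import pipelines as dp`) and the hypothetical `example` format from the earlier sketch; decorator names may differ across Spark versions.

```python
from pyspark import pipelines as dp
from pyspark.sql import SparkSession

# Inside a pipeline source file, reuse the session the pipeline runs under.
spark = SparkSession.getActiveSession()

@dp.materialized_view
def raw_events():
    # Ingest from the community connector's registered data source.
    return spark.read.format("example").load()
```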