This guide explains how to integrate the Faasera SDK into your data pipelines using Java or PySpark.
The Faasera SDK lets you embed core capabilities such as profiling, masking, and validation directly into your ETL or ML workflows. This is ideal for high-throughput, low-latency scenarios where data privacy must be enforced in real time.
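As a conceptual illustration only (this is not the Faasera API), an in-pipeline masking step replaces sensitive values inside the transform stage, before data leaves the pipeline. The sketch below uses a plain Python hash-based pseudonymizer to show the pattern; the function name and salt are assumptions for the example:

```python
import hashlib

def pseudonymize(value: str, salt: str = "demo-salt") -> str:
    # Deterministic masking: the same input always produces the same token,
    # so joins and group-bys on the masked column still work downstream.
    digest = hashlib.sha256((salt + value).encode()).hexdigest()
    return f"user_{digest[:12]}"

# Applied inside a simple ETL transform step:
records = [{"email": "alice@example.com"}, {"email": "bob@example.com"}]
masked = [{**r, "email": pseudonymize(r["email"])} for r in records]
```

In a real deployment this per-value transform is what the SDK applies at scale, driven by a policy file rather than hard-coded logic.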
| Environment | Integration Mode | SDK Format |
|---|---|---|
| Java | JAR / Maven | faasera-core.jar |
| PySpark | pip / .whl | faasera_sdk.whl |
| Databricks | Notebook / Wheel | %pip install |
If using Maven:
```xml
<dependency>
  <groupId>ai.faasera</groupId>
  <artifactId>faasera-core</artifactId>
  <version>1.0.0</version>
</dependency>
```
Or download the JAR:
```shell
wget https://repo.faasera.ai/releases/faasera-core-1.0.0.jar
```
```java
// Initialize the SDK with a masking policy and mask a DataFrame directly
FaaseraMaskingSDK sdk = new FaaseraMaskingSDK();
sdk.initialize("path/to/policy.json");
Dataset<Row> maskedDF = sdk.mask(inputDF);

// Alternatively, register the masking function as a Spark SQL UDF
inputDF.createOrReplaceTempView("input_table");
spark.udf().register("mask_udf", sdk.getUDF());
spark.sql("SELECT mask_udf(column1) FROM input_table").show();
```
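Both the Java and PySpark entry points are driven by a policy file. Its exact schema is product-specific; the fragment below is only a hypothetical sketch (every field name here is an assumption, not documented Faasera syntax) of the kind of structure such a policy might take:

```json
{
  "policies": [
    {
      "column": "email",
      "detector": "EMAIL",
      "maskingFunction": "deterministic-hash",
      "enabled": true
    }
  ]
}
```

Refer to the official policy documentation for the authoritative schema.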
```
%pip install faasera_sdk-1.0.0-py3-none-any.whl
```
```python
from faasera_sdk import FaaseraProfiler, FaaseraMasker

# Profile a Spark DataFrame to discover sensitive columns
profiler = FaaseraProfiler(policy_path="policy.json")
profiled_df = profiler.profile(spark_df)

# Apply the masking policy to the same DataFrame
masker = FaaseraMasker(policy_path="policy.json")
masked_df = masker.mask(spark_df)
```
You can run the SDK with sample CSVs or Parquet files:
```shell
java -jar faasera-core.jar --mode=mask --input=./data.csv --output=./masked.csv
```
Or use PySpark CLI:
```shell
pyspark --packages faasera_sdk
```
For SDK-based deployments, a license token (offline or cloud-linked) is required. Configure the license path as:
```properties
faasera.license.path=/etc/faasera/license.jwt
```
Or set via environment:
```shell
export FAASERA_LICENSE_PATH=/etc/faasera/license.jwt
```
Need help? Visit faasera.ai/docs or reach out to support@faasera.ai.