FTP/SFTP Data Connector
FTP (File Transfer Protocol) and SFTP (SSH File Transfer Protocol) are network protocols for transferring files between a client and a server. FTP sends data and credentials unencrypted, while SFTP encrypts the transfer over SSH.
The FTP/SFTP Data Connector enables federated SQL queries across supported file formats stored on FTP/SFTP servers.
Quickstart
Connect to an SFTP server and query CSV files:
datasets:
  - from: sftp://remote-sftp-server.com/path/to/folder/
    name: my_dataset
    params:
      file_format: csv
      sftp_port: 22
      sftp_user: my-sftp-user
      sftp_pass: ${secrets:my_sftp_password}
Configuration
from
The from field takes one of two forms: ftp://<host>/<path> or sftp://<host>/<path>, where <host> is the server to connect to and <path> is the path to the file or directory to read.
If a folder is provided, all child files will be loaded.
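As an illustration of the two cases, the sketch below defines one dataset that reads a single file and one that reads a whole folder. The host, paths, and dataset names are hypothetical. file_format is left off the single-file dataset on the assumption that it is only needed for directories, as the parameter tables state.

```yaml
datasets:
  # Single file: <path> points at one file (hypothetical path).
  - from: sftp://remote-sftp-server.com/path/to/file.csv
    name: single_file
    params:
      sftp_user: my-sftp-user
      sftp_pass: ${secrets:my_sftp_password}
  # Folder: all child files are loaded, so file_format is set explicitly.
  - from: sftp://remote-sftp-server.com/path/to/folder/
    name: whole_folder
    params:
      file_format: csv
      sftp_user: my-sftp-user
      sftp_pass: ${secrets:my_sftp_password}
```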
name
The dataset name, used as the table name in SQL queries. The name cannot be a reserved keyword.
params
FTP
| Parameter Name | Description |
|---|---|
| file_format | Required when connecting to a directory. See File Formats. |
| ftp_user | Username for FTP authentication. |
| ftp_pass | Password for FTP authentication. Use secrets syntax: ${secrets:my_ftp_pass}. |
| ftp_port | FTP server port. Default: 21. |
| client_timeout | Connection timeout duration. E.g. 30s, 1m. No timeout when unset. |
| hive_partitioning_enabled | Enable Hive-style partitioning from folder structure. Default: false. |
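
A minimal sketch combining the FTP parameters, assuming a hypothetical server that listens on port 2121 instead of the default and a 30-second connection timeout. The host, path, and secret key are placeholders.

```yaml
datasets:
  - from: ftp://remote-ftp-server.com/path/to/folder/
    name: my_dataset
    params:
      file_format: csv
      ftp_port: 2121        # hypothetical non-default port; 21 is used when omitted
      ftp_user: my-ftp-user
      ftp_pass: ${secrets:my_ftp_password}
      client_timeout: 30s   # no timeout is applied when this is unset
```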
SFTP
| Parameter Name | Description |
|---|---|
| file_format | Required when connecting to a directory. See File Formats. |
| sftp_user | Username for SFTP authentication. |
| sftp_pass | Password for SFTP authentication. Use secrets syntax: ${secrets:my_sftp_pass}. |
| sftp_port | SFTP server port. Default: 22. |
| client_timeout | Connection timeout duration. E.g. 30s, 1m. No timeout when unset. |
| hive_partitioning_enabled | Enable Hive-style partitioning from folder structure. Default: false. |
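
Hive-style partitioning encodes values in folder names of the form key=value. The sketch below assumes a hypothetical folder laid out by year and month and turns the option on. How the partition keys appear in query results is not described on this page, so treat any resulting column names as an assumption to verify.

```yaml
datasets:
  # Hypothetical layout on the server:
  #   /data/events/year=2024/month=01/...
  #   /data/events/year=2024/month=02/...
  - from: sftp://remote-sftp-server.com/data/events/
    name: events
    params:
      file_format: parquet
      sftp_user: my-sftp-user
      sftp_pass: ${secrets:my_sftp_password}
      hive_partitioning_enabled: true  # default is false
```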
Examples
Connecting to FTP
- from: ftp://remote-ftp-server.com/path/to/folder/
  name: my_dataset
  params:
    file_format: csv
    ftp_user: my-ftp-user
    ftp_pass: ${secrets:my_ftp_password}
    hive_partitioning_enabled: false
Connecting to SFTP
- from: sftp://remote-sftp-server.com/path/to/folder/
  name: my_dataset
  params:
    file_format: csv
    sftp_port: 22
    sftp_user: my-sftp-user
    sftp_pass: ${secrets:my_sftp_password}
    hive_partitioning_enabled: false
Secrets
Spice integrates with multiple secret stores for secure credential management. Store FTP/SFTP credentials in a secret store and reference them using the ${secrets:key} syntax.
datasets:
  - from: sftp://files.example.com/data/
    name: secure_data
    params:
      file_format: parquet
      sftp_user: ${secrets:sftp_username}
      sftp_pass: ${secrets:sftp_password}
For detailed information, refer to the secret stores documentation.
Troubleshooting
Connection Timeouts
If connections frequently time out, increase the client_timeout value:
params:
  client_timeout: 120s
Authentication Failures
Verify credentials are correctly stored in your secret store and that the user has read access to the specified path on the server.
File Format Errors
When connecting to a directory, ensure file_format is specified and matches the actual file types in the directory. Spice expects all files in a directory to have the same format.
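One way to work within the single-format rule, sketched here under the assumption that the files can be separated into per-format subfolders on the server, is to define one dataset per folder. The folder and dataset names are hypothetical.

```yaml
datasets:
  # Folder containing only CSV files (hypothetical path).
  - from: sftp://remote-sftp-server.com/exports/csv/
    name: exports_csv
    params:
      file_format: csv
      sftp_user: my-sftp-user
      sftp_pass: ${secrets:my_sftp_password}
  # Folder containing only Parquet files (hypothetical path).
  - from: sftp://remote-sftp-server.com/exports/parquet/
    name: exports_parquet
    params:
      file_format: parquet
      sftp_user: my-sftp-user
      sftp_pass: ${secrets:my_sftp_password}
```

Each dataset then appears as its own table, so a format mismatch in one folder does not affect the other.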
Cookbook
Refer to the FTP cookbook recipe to see an example of the FTP connector in use.
