FAQ

General questions

Who is Fused for?

Fused is designed for teams seeking to simplify their workflows and accelerate the creation and delivery of data products. It's ideal for organizations that need a scalable solution to handle growing data sizes while minimizing the time spent on data engineering.

Why Python, when there's spatial SQL?

Python is the go-to language for spatial data science. Although spatial SQL joins and transformations can be efficiently performed using PostGIS in an external database, you may eventually need to convert that data to Pandas and NumPy for further processing and analysis, especially for detailed operations on raster arrays. Additionally, you can run SQL directly on Fused using Python libraries like DuckDB, combining the strengths of both approaches.

What's the benefit of geo partitioning vector tables?

Geo partitioning enables efficient reading of large datasets by strategically partitioning GeoParquet files. Fused's GeoParquet format includes metadata that allows for spatial filtering of any dataset, so only the chunks relevant to a specific area of interest are loaded. This approach reduces memory usage and lets you work with datasets of any size using just Python.
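The chunk-selection idea can be sketched in plain Python. This is an illustration only, not Fused's implementation: the `BBox` type, the `chunks_to_read` helper, and the chunk names and bounds are all hypothetical, and it assumes each chunk's metadata records a bounding box.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    minx: float
    miny: float
    maxx: float
    maxy: float


def intersects(a: BBox, b: BBox) -> bool:
    """Return True when the two boxes overlap or touch."""
    return (
        a.minx <= b.maxx
        and b.minx <= a.maxx
        and a.miny <= b.maxy
        and b.miny <= a.maxy
    )


def chunks_to_read(chunk_bounds: dict, area_of_interest: BBox) -> list:
    """Keep only the chunks whose bounds intersect the area of interest."""
    return [
        name
        for name, bounds in chunk_bounds.items()
        if intersects(bounds, area_of_interest)
    ]


# Hypothetical per-chunk bounds, as partition metadata might record them.
chunk_bounds = {
    "chunk_0.parquet": BBox(-125.0, 32.0, -114.0, 42.0),
    "chunk_1.parquet": BBox(-80.0, 24.0, -66.0, 46.0),
    "chunk_2.parquet": BBox(-10.0, 35.0, 30.0, 60.0),
}
san_francisco = BBox(-122.6, 37.6, -122.3, 37.9)

print(chunks_to_read(chunk_bounds, san_francisco))  # ['chunk_0.parquet']
```

Only one of the three chunks would need to be read for this area of interest; the other two are skipped without being loaded into memory.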

When should I ingest a file vs. load it as is?

You should ingest a file if it has a spatial component and you plan to visualize it or use it for downstream analysis. Ingesting allows for more efficient and lightweight repeated access. On the other hand, if the file is small (under 100 MB), fits into memory, and is intended for a one-off operation, you should load it as is. This approach avoids the overhead of ingestion for single-use or infrequent access scenarios.

Which authentication methods do you support?

Fused currently uses Auth0 to support authentication via Google and GitHub.

How do we configure GitHub integration?

To configure the integration, connect your GitHub repository and provide us with the repository name and details. We'll activate it for you.

Is there a way to set environment variables or secrets/API keys?

Save environment variables to a .env file, as shown here, to make them available to UDFs. Secrets can be stored in the secrets manager.
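Once a variable is available to the UDF, it can be read with Python's standard `os.environ`. The sketch below is illustrative: the `get_setting` helper and the `MY_API_KEY` name are hypothetical, and the assignment to `os.environ` only stands in for a value that the .env file would normally supply.

```python
import os
from typing import Optional


def get_setting(name: str, default: Optional[str] = None) -> str:
    """Read an environment variable, failing loudly if it is missing."""
    value = os.environ.get(name, default)
    if value is None:
        raise KeyError(f"Environment variable {name!r} is not set")
    return value


# Stand-in for a value supplied by the .env file (hypothetical name).
os.environ["MY_API_KEY"] = "example-value"

api_key = get_setting("MY_API_KEY")
```

Failing early with a clear message tends to be easier to debug than passing an empty key to a downstream service.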

How can I share utility modules between UDFs?

This docs page explains the various ways a UDF can import utility modules from other UDFs.

A common practice for maintaining a set of shareable modules is to write utility functions in a UDF and have other UDFs import those functions. As an example, see Fused's common utils.py.

How can I change the basemap in the UDF Builder?

Set it in the map style settings, located at the top right of the UDF Builder map. Currently, light, dark, satellite, and blank basemaps are supported.

What is "Zoom to layer" and when would I use it?

If a default view state is set for the UDF, the viewport will zoom to that predefined extent whenever the UDF first loads or when the user clicks the "Zoom to layer" button. This feature is useful for centering the map on a specific area of interest when users open the UDF.

If the UDF doesn't have a default view state set, the "Zoom to layer" button will zoom to the extent of the data in the map.

How can I run a UDF asynchronously?

To run a UDF asynchronously, use Python's async/await syntax as shown here. This approach allows multiple UDF calls to run in parallel, each with different parameters, and enables you to combine the results once all calls are complete.
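The fan-out-and-combine pattern can be sketched with the standard `asyncio` module. Here `run_udf` is a hypothetical stand-in coroutine, not a Fused API; in practice it would be replaced by the awaitable UDF call.

```python
import asyncio


async def run_udf(param: int) -> dict:
    """Hypothetical stand-in for an awaitable UDF call."""
    await asyncio.sleep(0.01)  # simulates waiting on a remote worker
    return {"param": param, "value": param * param}


async def run_all(params: list) -> list:
    # gather runs the calls concurrently and returns results in input order.
    return await asyncio.gather(*(run_udf(p) for p in params))


results = asyncio.run(run_all([1, 2, 3]))

# Combine the results once every call has completed.
total = sum(r["value"] for r in results)
print(total)  # 14
```

In a notebook, where an event loop is usually already running, `await run_all([1, 2, 3])` would be used in place of `asyncio.run(...)`.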

Note: Setting sync=False in fused.run is intended for asynchronous calls when running in the cloud with engine='realtime'. The parameter has no effect if the UDF is run in the local environment with engine='local'.

Troubleshooting

Error: Access is not configured for you in the Fused Workbench. Please refresh the page if you think this is an error, or get in touch if you require further help. Cause: Realtime instance not configured.

This error occurs when you try to run a UDF with an account associated with a workspace environment that does not have a realtime instance configured. This means that there are no worker nodes available to run the UDF. To resolve this issue, please get in touch with the Fused team to ensure your account is associated with an environment with a realtime instance.

When troubleshooting this error, it may help to navigate to your account's User Profile page to determine if the account is associated with an environment and realtime instance, as shown here.

Error: No such file or directory: '/mnt/cache/'

This error occurs when a UDF attempts to access the /mnt/cache disk when it is not available for the environment. To resolve this issue, please contact the Fused team to ensure that the cache directory is available for your account.
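Until the cache disk is enabled, a UDF can guard against this error by checking for the directory before using it. The sketch below is one possible stopgap, not a Fused recommendation: the `pick_cache_dir` helper is hypothetical, and falling back to the system temporary directory is an assumption made for illustration.

```python
import os
import tempfile


def pick_cache_dir(preferred: str = "/mnt/cache") -> str:
    """Return `preferred` if it is a writable directory, else the temp directory."""
    if os.path.isdir(preferred) and os.access(preferred, os.W_OK):
        return preferred
    return tempfile.gettempdir()


cache_dir = pick_cache_dir()
print(f"Writing intermediate files to {cache_dir}")
```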

Error: No space left on the device: '/tmp/'

This error occurs when a UDF attempts to write more data than the /tmp directory of the realtime instance can hold. Realtime instances have a limited amount of space available, and their /tmp contents do not persist between runs. Consider writing to the /mnt/cache disk instead.
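One way to avoid hitting the limit mid-write is to check free space first with the standard `shutil.disk_usage`. This is a minimal sketch: the `has_room` helper and the 500 MB figure are hypothetical, and it assumes /mnt/cache is available for the account.

```python
import shutil
import tempfile


def has_room(directory: str, needed_bytes: int) -> bool:
    """Return True if `directory` has at least `needed_bytes` of free space."""
    return shutil.disk_usage(directory).free >= needed_bytes


tmp_dir = tempfile.gettempdir()
needed = 500 * 1024**2  # hypothetical size of the output, in bytes

# Fall back to the cache disk when the temp directory is too small.
target = tmp_dir if has_room(tmp_dir, needed) else "/mnt/cache"
print(f"Writing output under {target}")
```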

Error: Quota limit: Number of running instances

Fused batch jobs, which are initiated with run_remote, require a server quota to be enabled for your account. These include data ingestion jobs. If you encounter this error, please contact the Fused team to request an increase in the quota allotted to your account.

Error: Application error: a client-side exception has occurred (see browser console for more information).

If you encounter this error, please clear your browser cache and cookies. If the error persists, please contact the Fused team for further assistance.