Write UDFs
Follow these steps to write a User Defined Function (UDF).
- Decorate a function with @fused.udf
- Declare the function logic
- Optionally cache parts of the function
- Set typed parameters to dynamically run based on inputs
- Import utility modules to keep your code organized
- Return a vector table or raster
- Save the UDF
@fused.udf decorator
First, decorate a Python function with @fused.udf to tell Fused to treat it as a UDF.
Function declaration
Next, structure the UDF's code. Declare import statements within the function body, express operations to load and transform data, and define a return statement. This UDF is called udf and returns a pd.DataFrame object.
@fused.udf  # <- Fused decorator
def udf(name: str = "Fused"):  # <- Function declaration
    import pandas as pd

    return pd.DataFrame({'message': [f'Hello {name}!']})
The UDF Builder in Workbench imports the fused module automatically. To write UDFs outside of Workbench, install the Fused Python SDK with pip install fused and import it with import fused.
Placing import statements within a UDF function body (known as "local imports") is not a common Python practice, but there are specific reasons to do this when constructing UDFs. UDFs are distributed to servers as self-contained units, and each unit must import every module it needs for its execution. UDFs may be executed across many servers (tens, hundreds, or thousands), so any time lost to importing unused modules is multiplied.
The exception to this convention is modules used in function annotations, which must be imported outside of the function being annotated.
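As a plain-Python sketch of this convention (no Fused decorator is involved, and the function name and module choices here are illustrative assumptions), the module that appears in the annotation is imported at the top of the file, while the module used only inside the body is imported locally:

```python
from decimal import Decimal  # used in the annotation, so imported at module level


def describe(amount: Decimal = Decimal("1.50")) -> str:
    # Local import: loaded only when the function body actually runs.
    import textwrap

    return textwrap.shorten(f"Amount due: {amount}", width=40)


message = describe()
print(message)
```

The same layout applies to a UDF: annotation types at the top of the file, everything else inside the function.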
@fused.cache decorator
Use the @fused.cache decorator to persist a function's output across runs so UDFs run faster.
@fused.udf  # <- Fused decorator
def udf(bbox: fused.types.Bbox = None, name: str = "Fused"):
    import pandas as pd

    @fused.cache  # <- Cache decorator
    def structure_output(name):
        return pd.DataFrame({'message': [f'Hello {name}!']})

    df = structure_output(name)
    return df
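To build intuition for what persisting a function's output across runs means, here is a conceptual sketch using only the standard library. It is not how @fused.cache is implemented; the disk_cache decorator, the cache directory, and the call counter are all hypothetical stand-ins.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location, used only for this illustration.
CACHE_DIR = Path(tempfile.mkdtemp(prefix="udf_cache_"))
calls = {"count": 0}  # tracks how often the wrapped function really runs


def disk_cache(func):
    """Store results on disk, keyed by the function name and its arguments."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key_source = repr((func.__name__, args, sorted(kwargs.items())))
        key = hashlib.sha256(key_source.encode()).hexdigest()
        path = CACHE_DIR / f"{key}.pkl"
        if path.exists():
            # Cache hit: skip the computation and load the stored result.
            return pickle.loads(path.read_bytes())
        result = func(*args, **kwargs)
        path.write_bytes(pickle.dumps(result))
        return result

    return wrapper


@disk_cache
def structure_output(name):
    calls["count"] += 1
    return {"message": [f"Hello {name}!"]}


first = structure_output("Fused")
second = structure_output("Fused")  # served from disk, not recomputed
```

The second call returns the stored result without running the function body again, which is the effect caching aims for with expensive steps such as loading data.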
Typed parameters
UDFs resolve input parameters to the types specified in their function annotations. This ensures parameters serialized in HTTP calls resolve to their intended types at run time.
This example shows the bbox parameter typed as fused.types.Bbox and name as a string.
@fused.udf
def udf(
    bbox: fused.types.Bbox = None,  # <- Typed parameters
    name: str = "Fused"
):
    ...
To write a UDF that runs successfully as both File and Tile, set bbox as the first parameter, with None as its default value. The UDF can then be invoked as File (when bbox isn't passed) and as Tile (when it is). For example:
@fused.udf
def udf(bbox: fused.types.Bbox = None):
    ...
    return ...
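The pattern can be sketched in plain Python. This is a simplified illustration: bbox is assumed to be a (minx, miny, maxx, maxy) tuple rather than a fused.types.Bbox object, and POINTS is hypothetical sample data.

```python
# Hypothetical sample data: (longitude, latitude) pairs.
POINTS = [(-122.41, 37.80), (-73.99, 40.73), (2.35, 48.86)]


def udf(bbox=None):
    # File mode: no bbox was passed, so return the full dataset.
    if bbox is None:
        return POINTS
    # Tile mode: keep only the points inside the requested bounds.
    minx, miny, maxx, maxy = bbox
    return [(x, y) for x, y in POINTS if minx <= x <= maxx and miny <= y <= maxy]


whole = udf()
tile = udf(bbox=(-123.0, 37.0, -122.0, 38.0))
```

Because bbox defaults to None, the same function body serves both invocation styles.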
Supported types
Fused supports the native Python types int, float, bool, list, and dict. Parameters without a specified type are handled as strings by default.
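One way to picture annotation-driven resolution is the sketch below, which casts string values (as they would arrive in a query string) according to a function's annotations. It is an assumption-laden illustration, not Fused's actual resolver: resolve_params is a hypothetical helper, and parsing list and dict values as JSON is an assumed convention.

```python
import inspect
import json


def resolve_params(func, raw):
    """Cast raw string values to the types annotated on func's parameters."""
    signature = inspect.signature(func)
    resolved = {}
    for key, value in raw.items():
        annotation = signature.parameters[key].annotation
        if annotation is bool:
            resolved[key] = value.lower() in ("1", "true", "yes")
        elif annotation in (int, float):
            resolved[key] = annotation(value)
        elif annotation in (list, dict):
            resolved[key] = json.loads(value)  # assumed JSON encoding
        else:
            resolved[key] = value  # untyped parameters stay strings
    return resolved


def udf(count: int = 1, scale: float = 1.0, verbose: bool = False,
        tags: list = None, name="Fused"):
    return count * scale


raw = {"count": "3", "scale": "0.5", "verbose": "true",
       "tags": '["a", "b"]', "name": "42"}
params = resolve_params(udf, raw)
result = udf(**params)
```

Note that name has no annotation, so the value "42" stays a string rather than becoming an integer.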
The UDF Builder runs the UDF as a Map Tile if the first parameter is typed as fused.types.Bbox, fused.types.TileXYZ, or fused.types.TileGDF.
pd.DataFrame as JSON
Pass tables and geometries as serialized UDF parameters in HTTPS calls. Serialized JSON and GeoJSON parameters can be cast to a pd.DataFrame or gpd.GeoDataFrame. Note that while Fused requires import statements to be declared within the UDF function body, libraries used for typing must be imported at the top of the file.
import geopandas as gpd
import pandas as pd

@fused.udf
def udf(
    gdf: gpd.GeoDataFrame = None,
    df: pd.DataFrame = None
):
    ...
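To illustrate what serializing a table into a call parameter can look like, the sketch below encodes a list of records as JSON in a query string and decodes it again. The endpoint URL and the records layout are assumptions for illustration; the source does not specify the exact JSON orientation Fused expects.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical table, expressed as a list of row records.
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# Client side: serialize the table and attach it as the `df` parameter.
base_url = "https://example.com/run"  # placeholder endpoint, not a real Fused URL
url = f"{base_url}?{urlencode({'df': json.dumps(records)})}"

# Receiving side: pull the parameter back out and decode it.
query = parse_qs(urlsplit(url).query)
decoded = json.loads(query["df"][0])
print(decoded)
```

A decoded list of records like this is the kind of structure that can then be turned into a table, for example with pd.DataFrame(decoded).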
Reserved parameters
When running a UDF with fused.run, it's possible to specify the map tile Fused will use to structure the bbox object by using the following reserved parameters.
With x, y, z parameters
fused.run("UDF_Overture_Maps_Example", x=5241, y=12662, z=15)
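To see which area an x, y, z index refers to, the sketch below converts a tile index to geographic bounds. It assumes the standard Web Mercator (slippy map) tiling scheme; the source does not state how Fused derives bbox internally, and tile_bounds is a hypothetical helper.

```python
import math


def tile_bounds(x, y, z):
    """Return (west, south, east, north) in degrees for an XYZ tile."""
    n = 2 ** z  # number of tiles along each axis at this zoom level

    def lon(col):
        return col / n * 360.0 - 180.0

    def lat(row):
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    # Row numbers grow southward, so row y is the northern edge.
    return lon(x), lat(y + 1), lon(x + 1), lat(y)


west, south, east, north = tile_bounds(5241, 12662, 15)
print(west, south, east, north)
```

Under this assumption, the tile used in the call above covers a small area of San Francisco, at roughly longitude -122.42 and latitude 37.80.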
With a bbox GeoDataFrame
import geopandas as gpd
bbox = gpd.GeoDataFrame.from_features({"type":"FeatureCollection","features":[{"type":"Feature","properties":{},"geometry":{"coordinates":[[[-122.41152460661726,37.80695951427788],[-122.41152460661726,37.80386837460925],[-122.40744576928229,37.80386837460925],[-122.40744576928229,37.80695951427788],[-122.41152460661726,37.80695951427788]]],"type":"Polygon"},"id":1}]})
fused.run("UDF_Overture_Maps_Example", bbox=bbox)
With a bbox bounds array
fused.run('UDF_Overture_Maps_Example', bbox=[-122.349, 37.781, -122.341, 37.818])
utils Module
Define a UDF's utils Module file in the Workbench "Module" tab and import it in the UDF. Use it to modularize code and keep it readable, maintainable, and reusable.
from utils import function
Import utils from other UDFs
UDFs can import the utils Module from other UDFs with fused.load, either from the public UDFs GitHub repo or from private GitHub repos. Here the commit SHA 05ba2ab pins utils to a specific commit for version control.
utils = fused.load(
    "https://github.com/fusedio/udfs/tree/05ba2ab/public/common/"
)
Modules in the public UDFs repo are imported from fused.utils.
utils = fused.utils.common
utils Modules can also be imported from other UDFs in a user's account.
utils = fused.load("your@email.com/my_udf").utils
return object
UDFs return either a table or an array.
- Tables can be: pd.DataFrame, pd.Series, gpd.GeoDataFrame, gpd.GeoSeries, and shapely geometry.
- Arrays can be: numpy.ndarray, xarray.Dataset, xarray.DataArray, and io.BytesIO. Fused Workbench only supports the rendering of uint8 arrays. Rasters without spatial metadata should indicate their tile bounds.
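Since only uint8 arrays render in Workbench, floating-point rasters generally need rescaling to the 0 to 255 range first. The following is a pure-Python sketch of a linear rescale on nested lists; to_uint8 is a hypothetical helper, and min-max scaling is one assumed choice among several.

```python
def to_uint8(rows):
    """Linearly rescale a 2D list of numbers to integers in 0..255."""
    flat = [value for row in rows for value in row]
    low, high = min(flat), max(flat)
    span = (high - low) or 1  # avoid division by zero for constant input
    return [[round((value - low) / span * 255) for value in row] for row in rows]


raster = [[0.0, 0.25], [0.5, 1.0]]
scaled = to_uint8(raster)
print(scaled)
```

With numpy, the same idea is usually written as array arithmetic followed by .astype("uint8").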
Save UDFs
UDFs exported from the UDF Builder or saved locally are formatted as a .zip file containing the UDF's code, utils Module, metadata, and README.md.
└── Sample_UDF
    ├── README.md       # Description and metadata
    ├── Sample_UDF.py   # UDF code
    ├── meta.json       # Fused metadata
    └── utils.py        # `utils` Module
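An archive with this layout can be inspected with the standard library. The sketch below builds a stand-in archive in memory, since no real export is available here; the placeholder file contents and the Sample_UDF/ path prefix inside the archive are assumptions.

```python
import io
import zipfile

# Build a stand-in archive in memory; a real export would come from Fused.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for name in ("README.md", "Sample_UDF.py", "meta.json", "utils.py"):
        archive.writestr(f"Sample_UDF/{name}", "")  # placeholder contents

# Reopen the archive for reading and list its members.
with zipfile.ZipFile(buffer) as archive:
    members = sorted(archive.namelist())

print(members)
```

For a real export, pass the path of the downloaded .zip file to zipfile.ZipFile instead of the in-memory buffer.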
Outside of Workbench, save a UDF to your local filesystem with my_udf.to_directory('Sample_UDF') and to the Fused cloud with my_udf.to_fused().
Debug UDFs
UDF Builder
A common approach to debugging UDFs is to show intermediate results in the UDF Builder results panel with print statements.
HTTP requests
When using HTTP requests, any error messages are included in the X-Fused-Metadata response header and can be used for debugging. To inspect the header in a browser, open the Network tab of the Developer Tools.
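The header can also be read programmatically. The sketch below uses urllib from the standard library and treats the header value as an opaque string, since its format is not described here. fetch_debug_info is a hypothetical helper, and fake_opener simulates a failing endpoint so the example runs offline.

```python
import io
import urllib.error
import urllib.request
from email.message import Message

DEBUG_HEADER = "X-Fused-Metadata"


def fetch_debug_info(url, opener=urllib.request.urlopen):
    """Return (status code, debug header value or None) for a request."""
    try:
        with opener(url) as response:
            return response.status, response.headers.get(DEBUG_HEADER)
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the response headers live on the exception.
        return err.code, err.headers.get(DEBUG_HEADER)


def fake_opener(url):
    # Stand-in for a failing endpoint; the header value is made up.
    headers = Message()
    headers[DEBUG_HEADER] = "example error details"
    raise urllib.error.HTTPError(
        url, 500, "Internal Server Error", headers, io.BytesIO(b"")
    )


status, details = fetch_debug_info("https://example.com/udf", opener=fake_opener)
print(status, details)
```

Against a real endpoint, drop the opener argument so the default urllib.request.urlopen is used.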