SindbadUtils.SindbadUtils Module
julia
SindbadUtils

The SindbadUtils package provides a collection of utility functions and tools for handling data, managing NamedTuples, and performing spatial and temporal operations in the SINDBAD framework. It serves as a foundational package for simplifying common tasks and ensuring consistency across SINDBAD experiments.

Purpose:

This package is designed to provide reusable utilities for data manipulation, statistical operations, and spatial/temporal processing.

Dependencies:

  • Sindbad: Provides the core SINDBAD models and types.

  • Crayons: Enables colored terminal output, improving the readability of logs and messages.

  • StyledStrings: Provides styled text for enhanced terminal output.

  • Dates: Facilitates date and time operations, useful for temporal data processing.

  • FIGlet: Generates ASCII art text, useful for creating visually appealing headers in logs or outputs.

  • Logging: Provides logging utilities for debugging and monitoring SINDBAD workflows.

Included Files:

  1. getArrayView.jl:
  • Implements functions for creating views of arrays, enabling efficient data slicing and subsetting.
  2. utils.jl:
  • Contains general-purpose utility functions for data manipulation and processing.
  3. utilsNT.jl:
  • Provides utilities for working with NamedTuples, including transformations and access operations.
  4. utilsTemporal.jl:
  • Handles temporal operations, including time-based filtering and aggregation.

Exported

SindbadUtils.addPackage Method
julia
addPackage(where_to_add, the_package_to_add)

Adds a specified Julia package to the environment of a given module or project.

Arguments:

  • where_to_add: The module or project where the package should be added.

  • the_package_to_add: The name of the package to add.

Behavior:

  • Activates the environment of the specified module or project.

  • Checks if the package is already installed in the environment.

  • If the package is not installed:

    • Adds the package to the environment.

    • Removes the Manifest.toml file and reinstantiates the environment to ensure consistency.

    • Provides instructions for importing the package in the module.

  • Restores the original environment after the operation.

Notes:

  • This function assumes that the where_to_add module or project is structured with a standard Julia project layout.

  • It requires the Pkg module for package management, which is re-exported from core Sindbad.

Example:

julia
addPackage(MyModule, "DataFrames")
SindbadUtils.booleanizeArray Method
julia
booleanizeArray(_array)

Converts an array into a boolean array where elements greater than zero are true.

Arguments:

  • _array: The input array to be converted.

Returns:

A boolean array with the same dimensions as _array.
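
The described behavior corresponds to a broadcast comparison against zero. The following is a minimal sketch under that assumption; `booleanize_sketch` is a hypothetical stand-in written for illustration, not the SindbadUtils implementation.

```julia
# Hypothetical stand-in: elements greater than zero become true, all others false.
booleanize_sketch(arr) = arr .> 0

mask = booleanize_sketch([-1.0 0.0; 0.5 2.0])
# mask has the same dimensions as the input; zero and negative entries are false
```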

SindbadUtils.createTimeAggregator Function
julia
createTimeAggregator(date_vector, t_step, aggr_func = mean, skip_aggregation = false)

Creates a temporal aggregation struct for a given time step.

Arguments:

  • date_vector: a vector of datetime objects that determine the index of the array to be aggregated

  • t_step: a string/Symbol/Type defining the aggregation time target with different types as follows:

    • ::Union{String, Symbol}: a string/Symbol defining the aggregation time target from the settings
  • aggr_func: a function to use for aggregation, defaults to mean

  • skip_aggregation: a flag indicating if the aggregation target is the same as the input data and the aggregation can be skipped, defaults to false

Returns:

  • ::Vector{TimeAggregator}: a vector of TimeAggregator structs

t_step:

TimeAggregation

Abstract type for time aggregation methods in SINDBAD

Available methods/subtypes:

  • TimeAllYears: aggregation/slicing to include all years

  • TimeArray: use array-based time aggregation

  • TimeDay: aggregation to daily time steps

  • TimeDayAnomaly: aggregation to daily anomalies

  • TimeDayIAV: aggregation to daily IAV

  • TimeDayMSC: aggregation to daily MSC

  • TimeDayMSCAnomaly: aggregation to daily MSC anomalies

  • TimeDiff: aggregation to time differences, e.g. monthly anomalies

  • TimeFirstYear: aggregation/slicing of the first year

  • TimeHour: aggregation to hourly time steps

  • TimeHourAnomaly: aggregation to hourly anomalies

  • TimeHourDayMean: aggregation to mean of hourly data over days

  • TimeIndexed: aggregation using time indices, e.g., TimeFirstYear

  • TimeMean: aggregation to mean over all time steps

  • TimeMonth: aggregation to monthly time steps

  • TimeMonthAnomaly: aggregation to monthly anomalies

  • TimeMonthIAV: aggregation to monthly IAV

  • TimeMonthMSC: aggregation to monthly MSC

  • TimeMonthMSCAnomaly: aggregation to monthly MSC anomalies

  • TimeNoDiff: aggregation without time differences

  • TimeRandomYear: aggregation/slicing of a random year

  • TimeShuffleYears: aggregation/slicing/selection of shuffled years

  • TimeSizedArray: aggregation to a sized array

  • TimeYear: aggregation to yearly time steps

  • TimeYearAnomaly: aggregation to yearly anomalies
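
To make the idea of a time aggregator concrete, the sketch below groups a date vector into monthly index groups and applies an aggregation function to each group. It uses only the `Dates` and `Statistics` standard libraries; `month_index_groups_sketch` is a hypothetical helper, and the actual `TimeAggregator` struct may store its indices and function differently.

```julia
using Dates, Statistics

# Hypothetical helper: one vector of indices per (year, month) present in the dates.
function month_index_groups_sketch(date_vector)
    month_keys = [(year(d), month(d)) for d in date_vector]
    return [findall(==(k), month_keys) for k in unique(month_keys)]
end

dates = Date(2020, 1, 1):Day(1):Date(2020, 3, 31)
daily_values = collect(1.0:length(dates))

groups = month_index_groups_sketch(dates)
# Apply the aggregation function (here: mean) to each monthly group.
monthly_means = [mean(daily_values[idx]) for idx in groups]
```

Precomputing the index groups once and reusing them is presumably what makes an aggregator struct useful when the same date vector is aggregated repeatedly.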

SindbadUtils.dictToNamedTuple Method
julia
dictToNamedTuple(d::AbstractDict)

Convert a nested dictionary to a NamedTuple.

Arguments

  • d::AbstractDict: The input dictionary to convert

Returns

  • A NamedTuple with the same structure as the input dictionary
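
A nested conversion of this kind can be written recursively. The sketch below is an illustration only: `dict_to_nt_sketch` is a hypothetical name, and it assumes that keys can be converted with `Symbol` and that only values of type `AbstractDict` need to be descended into.

```julia
# Hypothetical stand-in: recursively convert a (possibly nested) dictionary.
function dict_to_nt_sketch(d::AbstractDict)
    field_names = Tuple(Symbol.(collect(keys(d))))
    # keys(d) and values(d) iterate in the same order, so names and values line up
    field_values = Tuple(v isa AbstractDict ? dict_to_nt_sketch(v) : v for v in values(d))
    return NamedTuple{field_names}(field_values)
end

nt = dict_to_nt_sketch(Dict("a" => 1, "b" => Dict("c" => 2.0)))
# nested dictionaries become nested NamedTuples, e.g. nt.b.c
```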
SindbadUtils.doNothing Method
julia
doNothing(dat)

Returns the input as is, without any modifications.

Arguments:

  • dat: The input data.

Returns:

The same input data.

SindbadUtils.doTemporalAggregation Function
julia
doTemporalAggregation(dat, temporal_aggregators, aggregation_type)

Aggregates the data in time using a vector of aggregators.

Arguments:

  • dat: a data array/vector to aggregate

  • temporal_aggregators: a vector of time aggregator structs with indices and function to do aggregation

  • aggregation_type: a type defining the type of aggregation to be done as follows:

    • ::TimeNoDiff: a type defining that the aggregator does not require removing/reducing values from original time series

    • ::TimeDiff: a type defining that the aggregator requires removing/reducing values from the original time series. The first aggregator aggregates the main time series; the second aggregates the time series to be removed.

    • ::TimeIndexed: a type defining that the aggregator requires indexing the original time series
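
The `TimeDiff` case can be pictured as two aggregations followed by a difference. The sketch below assumes that "removing" means plain subtraction and uses hand-written index groups instead of `TimeAggregator` structs; all names in it are hypothetical.

```julia
using Statistics

series = [1.0, 3.0, 5.0, 7.0]

# First aggregation: the main time series, here means over two index groups.
main_groups = [1:2, 3:4]
main_aggregated = [mean(series[idx]) for idx in main_groups]

# Second aggregation: the series to be removed, here the mean over all time steps.
to_remove = mean(series)

# Assumed combination step: subtract the second result from the first.
anomalies = main_aggregated .- to_remove
```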

SindbadUtils.dropFields Method
julia
dropFields(namedtuple::NamedTuple, names::Tuple{Vararg{Symbol}})

Remove specified fields from a NamedTuple.

Arguments

  • namedtuple: The input NamedTuple

  • names: A tuple of field names to remove

Returns

  • A new NamedTuple with the specified fields removed
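
Because NamedTuples are immutable, removing fields means building a new NamedTuple from the remaining keys. A minimal sketch of that idea, using the hypothetical name `drop_fields_sketch` and only Base functionality:

```julia
# Hypothetical stand-in: keep every field whose name is not listed in `names`.
function drop_fields_sketch(nt::NamedTuple, names::Tuple{Vararg{Symbol}})
    kept = filter(k -> !(k in names), keys(nt))
    return NamedTuple{kept}(nt)
end

original = (a = 1, b = 2, c = 3)
reduced = drop_fields_sketch(original, (:b,))
# `original` is left untouched; `reduced` no longer has the field b
```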
SindbadUtils.entertainMe Function
julia
entertainMe(n=10, disp_text="SINDBAD")

Displays the given text disp_text as a banner n times.

Arguments:

  • n: Number of times to display the banner (default: 10).

  • disp_text: The text to display (default: "SINDBAD").

  • c_olor: Whether to display the text in random colors (default: false).

SindbadUtils.foldlUnrolled Method
julia
foldlUnrolled(f, x::Tuple{Vararg{Any, N}}; init)

Generates an unrolled expression that applies a function to each element of a tuple in turn, so the compiler does not have to handle a for loop over elements of different types.

Arguments

  • f: The function to apply

  • x: The tuple to iterate through

  • init: Initial value for the fold operation

Returns

  • The result of applying the function to each element
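
One common way to unroll a fold over a tuple in Julia is a `@generated` function, which builds the nested call `f(f(f(init, x[1]), x[2]), x[3])` from the tuple length at compile time. The sketch below illustrates that technique; `foldl_unrolled_sketch` is hypothetical, and it takes `init` as a positional argument for simplicity, which differs from the keyword form in the signature above.

```julia
# Hypothetical stand-in: build the nested call expression from the tuple length N.
@generated function foldl_unrolled_sketch(f, x::Tuple{Vararg{Any,N}}, init) where {N}
    ex = :init
    for i in 1:N
        ex = :(f($ex, x[$i]))   # wrap the previous expression in one more call
    end
    return ex
end

total = foldl_unrolled_sketch(+, (1, 2.5, 3), 0)
joined = foldl_unrolled_sketch((acc, v) -> string(acc, v), ("a", 1, :b), "")
```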
SindbadUtils.getAbsDataPath Method
julia
getAbsDataPath(info, data_path)

Converts a relative data path to an absolute path based on the experiment directory.

Arguments:

  • info: The SINDBAD experiment information object.

  • data_path: The relative or absolute data path.

Returns:

An absolute data path.
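
The conversion amounts to leaving absolute paths alone and joining relative ones onto a base directory. In the sketch below the experiment directory is passed directly as `experiment_dir`, because how `info` stores it is not described here; the function name and argument are assumptions.

```julia
# Hypothetical stand-in: `experiment_dir` plays the role of the directory held in `info`.
abs_data_path_sketch(experiment_dir, data_path) =
    isabspath(data_path) ? data_path : normpath(joinpath(experiment_dir, data_path))

resolved = abs_data_path_sketch(pwd(), joinpath("data", "forcing.nc"))
```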

SindbadUtils.getArrayView Function
julia
getArrayView(_dat::AbstractArray{<:Any, N}, inds::Tuple{Vararg{Int}}) where N

Creates a view of the input array _dat based on the provided indices tuple inds.

Arguments:

  • _dat: The input array from which a view is created. Can be of any dimensionality.

  • inds: A tuple of integer indices specifying the spatial or temporal dimensions to slice.

Returns:

  • A SubArray view of _dat corresponding to the specified indices.

Notes:

  • The function supports arrays of arbitrary dimensions (N).

  • For arrays with fewer dimensions than the size of inds, an error is thrown.

  • For higher-dimensional arrays, the indices are applied to the last dimensions, while earlier dimensions are accessed using Colon() (i.e., all elements are included).

  • This function avoids copying data by creating a view, which is efficient for large arrays.

Error Handling:

  • Throws an error if the dimensionality of _dat is less than the size of inds.
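
The notes above can be sketched with `view` and a tuple of `Colon()` for the leading dimensions. This is an illustration under those stated rules, with the hypothetical name `array_view_sketch`, not the package code.

```julia
# Hypothetical stand-in: apply `inds` to the last dimensions, keep all earlier ones.
function array_view_sketch(dat::AbstractArray{<:Any,N}, inds::Tuple{Vararg{Int}}) where {N}
    n_inds = length(inds)
    n_inds <= N || error("array has $N dimensions but $n_inds indices were given")
    leading = ntuple(_ -> Colon(), N - n_inds)
    return view(dat, leading..., inds...)
end

cube = collect(reshape(1:24, 2, 3, 4))
v = array_view_sketch(cube, (2, 3))   # same elements as cube[:, 2, 3], without copying
```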
SindbadUtils.getCombinedNamedTuple Method
julia
getCombinedNamedTuple(base_nt::NamedTuple, priority_nt::NamedTuple)

Combine property values from base and priority NamedTuples.

Arguments

  • base_nt: The base NamedTuple

  • priority_nt: The priority NamedTuple whose values take precedence

Returns

  • A new NamedTuple combining values from both inputs
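
For flat NamedTuples, the precedence rule matches what `Base.merge` does: fields of the second argument override fields of the first. The sketch below shows only that Base behavior; whether `getCombinedNamedTuple` also handles nested fields or fields missing from the base is not specified here.

```julia
base_nt = (a = 1, b = 2)
priority_nt = (b = 20, c = 3)

# Base.merge: later arguments win for shared field names, new fields are appended.
combined = merge(base_nt, priority_nt)
```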
SindbadUtils.getNamedTupleFromTable Method
julia
getNamedTupleFromTable(tbl; replace_missing_values=false)

Convert a table to a NamedTuple.

Arguments

  • tbl: The input table

  • replace_missing_values: Whether to replace missing values with empty strings

Returns

  • A NamedTuple representation of the table
SindbadUtils.getSindbadDataDepot Method
julia
getSindbadDataDepot(; env_data_depot_var="SINDBAD_DATA_DEPOT", local_data_depot="../data")

Retrieve the Sindbad data depot path.

Arguments

  • env_data_depot_var: Environment variable name for the data depot (default: "SINDBAD_DATA_DEPOT")

  • local_data_depot: Local path to the data depot (default: "../data")

Returns

The path to the Sindbad data depot.
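
A lookup of this kind is often an environment-variable read with a local fallback. The sketch below assumes that the environment variable takes precedence when it is set, which the description above does not state; `data_depot_sketch` is a hypothetical name.

```julia
# Hypothetical stand-in: prefer the environment variable, fall back to the local path.
data_depot_sketch(; env_data_depot_var="SINDBAD_DATA_DEPOT", local_data_depot="../data") =
    get(ENV, env_data_depot_var, local_data_depot)

# `withenv` sets (or, with `nothing`, unsets) the variable only inside the block.
from_env = withenv(() -> data_depot_sketch(), "SINDBAD_DATA_DEPOT" => "/some/depot")
from_local = withenv(() -> data_depot_sketch(), "SINDBAD_DATA_DEPOT" => nothing)
```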

SindbadUtils.getTimeAggregatorTypeInstance Function
julia
getTimeAggregatorTypeInstance(aggr)

Creates and returns a time aggregator instance based on the provided aggregation.

Arguments

  • aggr::Symbol: Symbol specifying the type of time aggregation to be performed

  • aggr::String: String specifying the type of time aggregation to be performed

Returns

An instance of the corresponding time aggregator type.

Notes:

  • A similar approach, getTypeInstanceForNamedOptions, is used in SindbadSetup to create types for other named options.
SindbadUtils.getTupleFromLongTuple Method
julia
getTupleFromLongTuple(long_tuple)

Convert a LongTuple to a regular tuple.

Arguments

  • long_tuple: The input LongTuple

Returns

  • A regular tuple containing all elements from the LongTuple
SindbadUtils.makeLongTuple Function
julia
makeLongTuple(normal_tuple; longtuple_size=5)

Create a LongTuple from a normal tuple.

Arguments

  • normal_tuple: The input tuple to convert

  • longtuple_size: Size to break down the tuple into (default: 5)

Returns

  • A LongTuple containing the elements of the input tuple
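
The internal layout of `LongTuple` is not described here. As an illustration of the general idea of breaking a long tuple into pieces of at most `longtuple_size` elements and flattening it back, the sketch below uses a plain tuple of tuples; both function names are hypothetical.

```julia
# Hypothetical stand-in: split a tuple into chunks of at most `chunk_size` elements.
function chunk_tuple_sketch(t::Tuple, chunk_size::Int=5)
    n = length(t)
    return Tuple(t[s:min(s + chunk_size - 1, n)] for s in 1:chunk_size:n)
end

# Inverse operation: concatenate the chunks back into one flat tuple.
flatten_chunks_sketch(chunks) = reduce((acc, c) -> (acc..., c...), chunks; init=())

chunks = chunk_tuple_sketch((1, 2, 3, 4, 5, 6, 7), 3)
restored = flatten_chunks_sketch(chunks)
```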
SindbadUtils.makeNamedTuple Method
julia
makeNamedTuple(input_data, input_names)

Create a NamedTuple from input data and names.

Arguments

  • input_data: Vector of data values

  • input_names: Vector of names for the fields

Returns

  • A NamedTuple with the specified names and values
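
With Base alone, a NamedTuple can be built from parallel vectors of values and names as sketched below. `make_nt_sketch` is a hypothetical stand-in and assumes the names can be converted with `Symbol`.

```julia
# Hypothetical stand-in: names become Symbols, values are paired with them by position.
make_nt_sketch(input_data, input_names) =
    NamedTuple{Tuple(Symbol.(input_names))}(Tuple(input_data))

nt = make_nt_sketch([1.0, "forest", true], ["lai", "pft", "active"])
```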
SindbadUtils.mergeNamedTuple Method

Merges algorithm options by combining default options with user-provided options.

This function takes two options objects and combines them, with user options taking precedence over default options.

Arguments

  • def_o: Default options object (NamedTuple/Struct/Dictionary) containing baseline algorithm parameters

  • u_o: User options object containing user-specified overrides

Returns

  • A merged object containing the combined algorithm options
SindbadUtils.nonUnique Method
julia
nonUnique(x::AbstractArray{T}) where T

Finds and returns a vector of duplicate elements in the input array.

Arguments:

  • x: The input array.

Returns:

A vector of duplicate elements.
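
One way to find duplicates is to count occurrences and keep the values seen more than once. In the sketch below, each duplicated value is returned once, in order of first appearance; that ordering and the name `non_unique_sketch` are assumptions, not documented behavior.

```julia
# Hypothetical stand-in: count each element, then keep those that occur more than once.
function non_unique_sketch(x::AbstractArray{T}) where {T}
    counts = Dict{T,Int}()
    for v in x
        counts[v] = get(counts, v, 0) + 1
    end
    return unique(filter(v -> counts[v] > 1, vec(collect(x))))
end

dups = non_unique_sketch([1, 2, 2, 3, 1, 4])
```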

SindbadUtils.removeEmptyTupleFields Method
julia
removeEmptyTupleFields(tpl::NamedTuple)

Remove all empty fields from a NamedTuple.

Arguments

  • tpl: The input NamedTuple

Returns

  • A new NamedTuple with empty fields removed
SindbadUtils.replaceInvalid Method
julia
replaceInvalid(_data, _data_fill)

Replaces invalid numbers in the input with a specified fill value.

Arguments:

  • _data: The input number.

  • _data_fill: The value to replace invalid numbers with.

Returns:

The input number if valid, otherwise the fill value.
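
What counts as "invalid" is not spelled out above. The sketch below assumes it covers `missing`, `nothing`, `NaN`, and infinite values; both function names are hypothetical.

```julia
# Assumed definition of "invalid": missing, nothing, NaN, or +/-Inf.
is_invalid_sketch(x) = ismissing(x) || isnothing(x) || !isfinite(x)

# Hypothetical stand-in: return the fill value for invalid inputs, the input otherwise.
replace_invalid_sketch(x, fill_value) = is_invalid_sketch(x) ? fill_value : x

# The function works on single numbers, so broadcasting applies it element-wise.
cleaned = replace_invalid_sketch.([1.0, NaN, Inf, 4.0], 0.0)
```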

SindbadUtils.setLogLevel Method
julia
setLogLevel(log_level::Symbol)

Sets the logging level to the specified level.

Arguments:

  • log_level: The desired logging level (:debug, :warn, :error).
SindbadUtils.setLogLevel Method
julia
setLogLevel()

Sets the logging level to Info.
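
With the `Logging` standard library, switching the level typically means installing a global logger with a new minimum level. The sketch below shows that pattern with the hypothetical name `set_log_level_sketch`; the mapping from symbols to levels is an assumption based on the symbols listed above.

```julia
using Logging

# Hypothetical stand-in: map a symbol to a log level and install a console logger.
function set_log_level_sketch(log_level::Symbol=:info)
    levels = Dict(:debug => Logging.Debug, :info => Logging.Info,
                  :warn => Logging.Warn, :error => Logging.Error)
    level = levels[log_level]
    global_logger(ConsoleLogger(stderr, level))
    return level
end

set_log_level_sketch(:warn)
@info "hidden, because the minimum level is now Warn"
@warn "shown"
```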

SindbadUtils.setTupleField Method
julia
setTupleField(tpl, vals)

Set a field in a NamedTuple.

Arguments

  • tpl: The input NamedTuple

  • vals: Tuple containing field name and value

Returns

  • A new NamedTuple with the updated field
SindbadUtils.setTupleSubfield Method
julia
setTupleSubfield(tpl, fieldname, vals)

Set a subfield of a NamedTuple.

Arguments

  • tpl: The input NamedTuple

  • fieldname: The name of the field to set

  • vals: Tuple containing subfield name and value

Returns

  • A new NamedTuple with the updated subfield
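
Setting a field or subfield of an immutable NamedTuple comes down to merging in a one-field NamedTuple, once at the top level and once inside the nested field. The sketch below uses hypothetical names and assumes that a missing field is created as an empty NamedTuple before the subfield is set.

```julia
# Hypothetical stand-in: vals is a (name, value) tuple.
set_field_sketch(tpl::NamedTuple, vals::Tuple) =
    merge(tpl, NamedTuple{(vals[1],)}((vals[2],)))

# Hypothetical stand-in: update (or create) tpl.fieldname, then put it back into tpl.
function set_subfield_sketch(tpl::NamedTuple, fieldname::Symbol, vals::Tuple)
    sub = haskey(tpl, fieldname) ? getfield(tpl, fieldname) : NamedTuple()
    return set_field_sketch(tpl, (fieldname, set_field_sketch(sub, vals)))
end

nt = (a = 1, b = (c = 2,))
nt_field = set_field_sketch(nt, (:a, 10))
nt_subfield = set_subfield_sketch(nt, :b, (:d, 3))
```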
SindbadUtils.sindbadBanner Function
julia
sindbadBanner(disp_text="SINDBAD")

Displays the given text as a banner using FIGlet.

Arguments:

  • disp_text: The text to display (default: "SINDBAD").

  • c_olor: Whether to display the text in random colors (default: false).

SindbadUtils.stackArrays Method
julia
stackArrays(arr)

Stacks a collection of arrays along the first dimension.

Arguments:

  • arr: A collection of arrays to be stacked. All arrays must have the same size along their non-stacked dimensions.

Returns:

  • A single array where the input arrays are stacked along the first dimension.

  • If the arrays are 1D, the result is a vector.

Notes:

  • The function uses hcat to horizontally concatenate the arrays and then creates a view to stack them along the first dimension.

  • If the first dimension of the input arrays has a size of 1, the result is flattened into a vector.

  • This function is efficient and avoids unnecessary data copying.

SindbadUtils.tabularizeList Method
julia
tabularizeList(_list)

Converts a list or tuple into a table using TypedTables.

Arguments:

  • _list: The input list or tuple.

Returns:

A table representation of the input list.

SindbadUtils.tcPrint Method
julia
tcPrint(d; _color=true, _type=true, _value=true, t_op=true)

Print a formatted representation of a data structure with type annotations and colors.

Arguments

  • d: The object to print

  • _color: Whether to use colors (default: true)

  • _type: Whether to show types (default: false)

  • _value: Whether to show values (default: true)

  • _tspace: Starting tab space

  • space_pad: Additional space padding

Returns

  • Nothing (prints to console)
SindbadUtils.toUpperCaseFirst Function
julia
toUpperCaseFirst(s::String, prefix="")

Converts the first letter of each word in a string to uppercase, removes underscores, and adds a prefix.

Arguments:

  • s: The input string.

  • prefix: A prefix to add to the resulting string (default: "").

Returns:

A Symbol with the transformed string.
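
The transformation can be sketched with Base string functions. The version below assumes that words are separated by underscores; `to_upper_case_first_sketch` is a hypothetical stand-in.

```julia
# Hypothetical stand-in: split on underscores, capitalize each word, join, add prefix.
to_upper_case_first_sketch(s::String, prefix="") =
    Symbol(prefix * join(uppercasefirst.(split(s, "_"))))

type_name = to_upper_case_first_sketch("month_anomaly", "Time")
```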

SindbadUtils.toggleStackTraceNT Function
julia
toggleStackTraceNT(toggle=true)

Modifies the display of stack traces to reduce verbosity for NamedTuples.

Arguments:

  • toggle: Whether to enable or disable the modification (default: true).
SindbadUtils.valToSymbol Method
julia
valToSymbol(val)

Returns the symbol corresponding to the type of the input value.

Arguments:

  • val: The input value.

Returns:

A Symbol representing the type of the input value.

Internal

Base.getindex Method
julia
Base.getindex(a::TimeAggregatorViewInstance, I::Vararg{Int, N})

extend the getindex function for TimeAggregatorViewInstance type

Base.size Method
julia
Base.size(a::TimeAggregatorViewInstance, i)

extend the size function for TimeAggregatorViewInstance type

Base.view Method
julia
Base.view(x::AbstractArray, v::TimeAggregator; dim = 1)

extend the view function for TimeAggregatorViewInstance type

Arguments:

  • x: input array to be viewed

  • v: time aggregator struct with indices and function

  • dim: the dimension along which the aggregation should be done

SindbadUtils.collectColorForTypes Method
julia
collectColorForTypes(d; _color = true)

Collect colors for all types from nested namedtuples.

Arguments

  • d: The input data structure

  • _color: Whether to use colors (default: true)

Returns

  • A dictionary mapping types to color codes
SindbadUtils.getIndexForSelectedYear Method
julia
getIndexForSelectedYear(years, sel_year)

a helper function to get the indices of the selected year (sel_year) from the vector of years

SindbadUtils.getIndicesForTimeGroups Method
julia
getIndicesForTimeGroups(groups)

a helper function to get the indices of the date group of the time series

SindbadUtils.getTimeAggrArray Method
julia
getTimeAggrArray(_dat::AbstractArray{T, 2})

a helper function to instantiate an array from the TimeAggregatorViewInstance for N-dimensional array

SindbadUtils.getTimeArray Method
julia
getTimeArray(ar, ::TimeSizedArray || ::TimeArray)

a helper function to get the array of indices

Arguments:

  • ar: an array of time

  • array type: a type defining the type of array to be returned

    • ::TimeSizedArray: indices as static array

    • ::TimeArray: indices as normal array

SindbadUtils.getTypeOfTimeIndexArray Function
julia
getTypeOfTimeIndexArray(_type=:array)

a helper function to easily switch the array type for indices of the TimeAggregator object

SindbadUtils.getTypes! Method
julia
getTypes!(d, all_types)

Collect all types from nested namedtuples.

Arguments

  • d: The input data structure

  • all_types: Array to store collected types

Returns

  • Array of unique types found in the data structure
SindbadUtils.getdim Method
julia
getdim(a::TimeAggregatorViewInstance{<:Any, <:Any, D})

get the dimension to aggregate for TimeAggregatorViewInstance type

SindbadUtils.mergeNamedTupleSetValue Function
julia
mergeNamedTupleSetValue(o, p, v)

Set a field in an options object.

Arguments

  • o: The options object (NamedTuple or mutable struct)

  • p: The field name to update

  • v: The new value to assign

Variants:

  1. For NamedTuple options:
  • Updates the field in an immutable NamedTuple by creating a new NamedTuple with the updated value.

  • Uses the @set macro for immutability handling.

  2. For mutable struct options (e.g., BayesOpt):
  • Directly updates the field in the mutable struct using Base.setproperty!.

Returns:

  • The updated options object with the specified field modified.

Notes:

  • This function is used internally by mergeNamedTuple to handle field updates in both mutable and immutable options objects.

  • Ensures compatibility with different types of optimization algorithm configurations.

Examples:

  1. Updating a NamedTuple:
julia
options = (max_iters = 100, tol = 1e-6)
updated_options = mergeNamedTupleSetValue(options, :tol, 1e-8)
  2. Updating a mutable struct:
julia
mutable struct BayesOptConfig
    max_iters::Int
    tol::Float64
end
config = BayesOptConfig(100, 1e-6)
updated_config = mergeNamedTupleSetValue(config, :tol, 1e-8)
SindbadUtils.temporalAggregation Function
julia
temporalAggregation(dat::AbstractArray, temporal_aggregator::TimeAggregator, dim = 1)

a temporal aggregation function to aggregate the data using a given aggregator when the input data is an array

Arguments:

  • dat: a data array/vector to aggregate with function for the following types:

    • ::AbstractArray: an array

    • ::SubArray: a view of an array

    • ::Nothing: a dummy type that returns the input without aggregating the data

  • temporal_aggregator: a time aggregator struct with indices and function to do aggregation

  • dim: the dimension along which the aggregation should be done
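
Aggregating along a chosen dimension can be sketched with `selectdim` and `mapslices` from Base. The index groups below stand in for the indices held by a `TimeAggregator`; the function name and its arguments are hypothetical.

```julia
using Statistics

# Hypothetical stand-in: reduce each index group along `dim`, then concatenate along `dim`.
function aggregate_along_dim_sketch(dat::AbstractArray, index_groups, aggr_func=mean; dim=1)
    pieces = [mapslices(aggr_func, selectdim(dat, dim, idx); dims=dim) for idx in index_groups]
    return cat(pieces...; dims=dim)
end

dat = [1.0 2.0; 3.0 4.0; 5.0 6.0; 7.0 8.0]   # 4 time steps (rows) x 2 variables
aggregated = aggregate_along_dim_sketch(dat, [1:2, 3:4]; dim=1)
```

The result keeps every dimension of the input and only shortens the aggregated one to the number of index groups.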