SindbadUtils.SindbadUtils Module
SindbadUtils
The SindbadUtils package provides a collection of utility functions and tools for handling data, managing NamedTuples, and performing spatial and temporal operations in the SINDBAD framework. It serves as a foundational package for simplifying common tasks and ensuring consistency across SINDBAD experiments.
Purpose:
This package is designed to provide reusable utilities for data manipulation, statistical operations, and spatial/temporal processing.
Dependencies:
Sindbad: Provides the core SINDBAD models and types.
Crayons: Enables colored terminal output, improving the readability of logs and messages.
StyledStrings: Provides styled text for enhanced terminal output.
Dates: Facilitates date and time operations, useful for temporal data processing.
FIGlet: Generates ASCII art text, useful for creating visually appealing headers in logs or outputs.
Logging: Provides logging utilities for debugging and monitoring SINDBAD workflows.
Included Files:
getArrayView.jl: Implements functions for creating views of arrays, enabling efficient data slicing and subsetting.
utils.jl: Contains general-purpose utility functions for data manipulation and processing.
utilsNT.jl: Provides utilities for working with NamedTuples, including transformations and access operations.
utilsTemporal.jl: Handles temporal operations, including time-based filtering and aggregation.
Exported
SindbadUtils.addPackage Method
addPackage(where_to_add, the_package_to_add)
Adds a specified Julia package to the environment of a given module or project.
Arguments:
where_to_add: The module or project where the package should be added.
the_package_to_add: The name of the package to add.
Behavior:
Activates the environment of the specified module or project.
Checks if the package is already installed in the environment.
If the package is not installed:
- Adds the package to the environment.
- Removes the Manifest.toml file and reinstantiates the environment to ensure consistency.
- Provides instructions for importing the package in the module.
Restores the original environment after the operation.
Notes:
This function assumes that the where_to_add module or project is structured with a standard Julia project layout.
It requires the Pkg module for package management, which is re-exported from core Sindbad.
Example:
addPackage(MyModule, "DataFrames")
SindbadUtils.booleanizeArray Method
booleanizeArray(_array)
Converts an array into a boolean array where elements greater than zero are true.
Arguments:
_array
: The input array to be converted.
Returns:
A boolean array with the same dimensions as _array.
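The conversion described above can be sketched in plain Julia. The name booleanize_sketch is a hypothetical stand-in, not the package's implementation, and the sketch assumes only the documented rule that values strictly greater than zero map to true.

```julia
# Hypothetical stand-in for illustration; not the SindbadUtils implementation.
# Elements strictly greater than zero become true, everything else false.
booleanize_sketch(_array) = _array .> 0

a = [-1.0 0.0; 2.5 3.0]
b = booleanize_sketch(a)   # boolean array with the same dimensions as `a`
```

Note that zero itself maps to false under this rule.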
SindbadUtils.createTimeAggregator Function
createTimeAggregator(date_vector, t_step, aggr_func = mean, skip_aggregation = false)
a function to create a temporal aggregation struct for a given time step
Arguments:
date_vector: a vector of datetime objects that determine the index of the array to be aggregated
t_step: a string/Symbol/Type defining the aggregation time target with different types as follows:
::Union{String, Symbol}: a string/Symbol defining the aggregation time target from the settings
aggr_func: a function to use for aggregation, defaults to mean
skip_aggregation: a flag indicating if the aggregation target is the same as the input data and the aggregation can be skipped, defaults to false
Returns:
::Vector{TimeAggregator}: a vector of TimeAggregator structs
t_step:
TimeAggregation
Abstract type for time aggregation methods in SINDBAD
Available methods/subtypes:
TimeAllYears: aggregation/slicing to include all years
TimeArray: use array-based time aggregation
TimeDay: aggregation to daily time steps
TimeDayAnomaly: aggregation to daily anomalies
TimeDayIAV: aggregation to daily IAV
TimeDayMSC: aggregation to daily MSC
TimeDayMSCAnomaly: aggregation to daily MSC anomalies
TimeDiff: aggregation to time differences, e.g. monthly anomalies
TimeFirstYear: aggregation/slicing of the first year
TimeHour: aggregation to hourly time steps
TimeHourAnomaly: aggregation to hourly anomalies
TimeHourDayMean: aggregation to mean of hourly data over days
TimeIndexed: aggregation using time indices, e.g., TimeFirstYear
TimeMean: aggregation to mean over all time steps
TimeMonth: aggregation to monthly time steps
TimeMonthAnomaly: aggregation to monthly anomalies
TimeMonthIAV: aggregation to monthly IAV
TimeMonthMSC: aggregation to monthly MSC
TimeMonthMSCAnomaly: aggregation to monthly MSC anomalies
TimeNoDiff: aggregation without time differences
TimeRandomYear: aggregation/slicing of a random year
TimeShuffleYears: aggregation/slicing/selection of shuffled years
TimeSizedArray: aggregation to a sized array
TimeYear: aggregation to yearly time steps
TimeYearAnomaly: aggregation to yearly anomalies
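To illustrate the idea behind an aggregator that stores indices and a function, here is a minimal sketch of monthly aggregation using only the Dates and Statistics standard libraries. The helper monthly_index_groups is hypothetical and does not reproduce the TimeAggregator struct; it only shows how index groups derived from a date vector can drive an aggregation function such as mean.

```julia
using Dates
using Statistics

# Hypothetical helper: group the positions of a date vector by (year, month),
# keeping the groups in order of first appearance.
function monthly_index_groups(date_vector)
    keys_in_order = Tuple{Int,Int}[]
    groups = Dict{Tuple{Int,Int},Vector{Int}}()
    for (i, d) in enumerate(date_vector)
        k = (year(d), month(d))
        if !haskey(groups, k)
            groups[k] = Int[]
            push!(keys_in_order, k)
        end
        push!(groups[k], i)
    end
    return [groups[k] for k in keys_in_order]
end

# Daily series for January to March 2001; the value is the day of the year.
dates = collect(Date(2001, 1, 1):Day(1):Date(2001, 3, 31))
series = Float64.(dayofyear.(dates))

# Apply the aggregation function to each index group.
indices = monthly_index_groups(dates)
monthly_means = [mean(series[idx]) for idx in indices]
```

Computing the index groups once and reusing them for every variable is what makes a precomputed aggregator attractive when many arrays share the same time axis.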
SindbadUtils.dictToNamedTuple Method
dictToNamedTuple(d::AbstractDict)
Convert a nested dictionary to a NamedTuple.
Arguments
d::AbstractDict
: The input dictionary to convert
Returns
- A NamedTuple with the same structure as the input dictionary
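A recursive conversion of this kind can be sketched as follows. The function dict_to_nt_sketch is a hypothetical illustration, not the package's code, and it assumes that all keys are strings or symbols.

```julia
# Hypothetical recursive conversion; keys are assumed to be strings or symbols.
function dict_to_nt_sketch(d::AbstractDict)
    # keys(d) and values(d) iterate in the same order for a given dictionary
    ks = Tuple(Symbol.(collect(keys(d))))
    vs = Tuple(v isa AbstractDict ? dict_to_nt_sketch(v) : v for v in values(d))
    return NamedTuple{ks}(vs)
end

settings = Dict("model" => Dict("name" => "demo", "steps" => 3), "verbose" => true)
nt = dict_to_nt_sketch(settings)

# Nested dictionaries become nested NamedTuples with dot access.
model_name = nt.model.name
```

Field order follows the dictionary's iteration order, which is not guaranteed to match insertion order for a plain Dict.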
SindbadUtils.doNothing Method
doNothing(dat)
Returns the input as is, without any modifications.
Arguments:
dat
: The input data.
Returns:
The same input data.
SindbadUtils.doTemporalAggregation Function
doTemporalAggregation(dat, temporal_aggregators, aggregation_type)
a temporal aggregation function to aggregate the data using a vector of aggregators
Arguments:
dat: a data array/vector to aggregate
temporal_aggregators: a vector of time aggregator structs with indices and function to do aggregation
aggregation_type: a type defining the type of aggregation to be done as follows:
::TimeNoDiff: a type defining that the aggregator does not require removing/reducing values from original time series
::TimeDiff: a type defining that the aggregator requires removing/reducing values from original time series. The first aggregator aggregates the main time series; the second aggregates the time series to be removed.
::TimeIndexed: a type defining that the aggregator requires indexing the original time series
SindbadUtils.dropFields Method
dropFields(namedtuple::NamedTuple, names::Tuple{Vararg{Symbol}})
Remove specified fields from a NamedTuple.
Arguments
namedtuple: The input NamedTuple
names: A tuple of field names to remove
Returns
- A new NamedTuple with the specified fields removed
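One way to drop fields from an immutable NamedTuple is to build a new one from the remaining keys. The sketch below uses only Base Julia; drop_fields_sketch is a hypothetical name and is not necessarily how the package does it.

```julia
# Hypothetical sketch: keep every key that is not listed in `names`.
function drop_fields_sketch(nt::NamedTuple, names::Tuple{Vararg{Symbol}})
    kept = filter(k -> !(k in names), keys(nt))   # filtering a tuple returns a tuple
    return NamedTuple{kept}(nt)                   # select the kept fields from `nt`
end

original = (a = 1, b = 2, c = 3)
reduced = drop_fields_sketch(original, (:b,))
```

The input is left untouched because NamedTuples are immutable; the function always returns a new value.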
SindbadUtils.entertainMe Function
entertainMe(n=10, disp_text="SINDBAD")
Displays the given text disp_text as a banner n times.
Arguments:
n: Number of times to display the banner (default: 10).
disp_text: The text to display (default: "SINDBAD").
c_olor: Whether to display the text in random colors (default: false).
SindbadUtils.foldlUnrolled Method
foldlUnrolled(f, x::Tuple{Vararg{Any, N}}; init)
Generates an unrolled expression that applies a function to each element of a tuple in turn, avoiding the complexity that a for loop over a tuple poses for the compiler.
Arguments
f: The function to apply
x: The tuple to iterate through
init: Initial value for the fold operation
Returns
- The result of applying the function to each element
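The unrolling technique can be sketched with a generated function, which builds one statement per tuple element at compile time because the tuple length N is part of the type. The name foldl_unrolled_sketch is hypothetical, and the body is an assumption about how such a fold can be written, not a copy of the package source.

```julia
# Hypothetical sketch of an unrolled left fold over a tuple.
# Inside a @generated function the arguments are types; only N is used to
# decide how many statements to emit.
@generated function foldl_unrolled_sketch(f, x::Tuple{Vararg{Any,N}}; init) where {N}
    exes = Any[:(acc = init)]
    for i in 1:N
        # one explicit statement per element instead of a runtime loop
        push!(exes, :(acc = f(acc, x[$i])))
    end
    push!(exes, :(return acc))
    return Expr(:block, exes...)
end

total = foldl_unrolled_sketch(+, (1, 2, 3); init=0)
label = foldl_unrolled_sketch((acc, v) -> acc * string(v), (1, "a", 2.0); init="")
```

Because each element access x[i] uses a literal index, the compiler can infer a concrete type for every step even when the tuple is heterogeneous.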
SindbadUtils.getAbsDataPath Method
getAbsDataPath(info, data_path)
Converts a relative data path to an absolute path based on the experiment directory.
Arguments:
info: The SINDBAD experiment information object.
data_path: The relative or absolute data path.
Returns:
An absolute data path.
SindbadUtils.getArrayView Function
getArrayView(_dat::AbstractArray{<:Any, N}, inds::Tuple{Vararg{Int}}) where N
Creates a view of the input array _dat based on the provided indices tuple inds.
Arguments:
_dat: The input array from which a view is created. Can be of any dimensionality.
inds: A tuple of integer indices specifying the spatial or temporal dimensions to slice.
Returns:
- A SubArray view of _dat corresponding to the specified indices.
Notes:
The function supports arrays of arbitrary dimensions (N).
For arrays with fewer dimensions than the size of inds, an error is thrown.
For higher-dimensional arrays, the indices are applied to the last dimensions, while earlier dimensions are accessed using Colon() (i.e., all elements are included).
This function avoids copying data by creating a view, which is efficient for large arrays.
Error Handling:
- Throws an error if the dimensionality of _dat is less than the size of inds.
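The indexing rule in the notes (colons for the leading dimensions, the given indices for the trailing ones) can be sketched in Base Julia as below. array_view_sketch is a hypothetical name and the error message is illustrative only.

```julia
# Hypothetical sketch: apply `inds` to the last dimensions and Colon() to the rest.
function array_view_sketch(dat::AbstractArray{<:Any,N}, inds::Tuple{Vararg{Int}}) where {N}
    n_inds = length(inds)
    n_inds <= N || error("array has $N dimensions but $n_inds indices were given")
    leading = ntuple(_ -> Colon(), N - n_inds)   # all elements of the earlier dimensions
    return view(dat, leading..., inds...)
end

cube = reshape(collect(1:24), 2, 3, 4)
v = array_view_sketch(cube, (2, 3))   # same elements as cube[:, 2, 3], without a copy

# Writing through the view changes the parent array.
v[1] = 0
```

Since the result is a view, it shares memory with the original array, so in-place updates are visible in both.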
SindbadUtils.getCombinedNamedTuple Method
getCombinedNamedTuple(base_nt::NamedTuple, priority_nt::NamedTuple)
Combine property values from base and priority NamedTuples.
Arguments
base_nt: The base NamedTuple
priority_nt: The priority NamedTuple whose values take precedence
Returns
- A new NamedTuple combining values from both inputs
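For flat NamedTuples, the precedence rule matches what Base.merge already does: fields of the rightmost argument win. The sketch below assumes a flat, non-nested combination, which may be simpler than what the package actually does.

```julia
# Sketch under the assumption of a flat (non-nested) combination:
# Base.merge keeps all fields and lets the right-hand NamedTuple win on conflicts.
base_nt = (a = 1, b = 2)
priority_nt = (b = 20, c = 30)

combined = merge(base_nt, priority_nt)
```

Fields that exist only in one of the inputs are carried over unchanged.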
SindbadUtils.getNamedTupleFromTable Method
getNamedTupleFromTable(tbl; replace_missing_values=false)
Convert a table to a NamedTuple.
Arguments
tbl: The input table
replace_missing_values: Whether to replace missing values with empty strings
Returns
- A NamedTuple representation of the table
SindbadUtils.getSindbadDataDepot Method
getSindbadDataDepot(; env_data_depot_var="SINDBAD_DATA_DEPOT", local_data_depot="../data")
Retrieve the Sindbad data depot path.
Arguments
env_data_depot_var: Environment variable name for the data depot (default: "SINDBAD_DATA_DEPOT")
local_data_depot: Local path to the data depot (default: "../data")
Returns
The path to the Sindbad data depot.
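A lookup of this kind is commonly written with get on ENV. The sketch below assumes that the environment variable, when set, takes precedence over the local path; that precedence and the name data_depot_sketch are assumptions, not documented behavior.

```julia
# Hypothetical sketch: prefer the environment variable, fall back to the local path.
function data_depot_sketch(; env_data_depot_var="SINDBAD_DATA_DEPOT", local_data_depot="../data")
    return get(ENV, env_data_depot_var, local_data_depot)
end

# With an unset variable the local default is returned.
fallback = data_depot_sketch(env_data_depot_var="SINDBAD_SKETCH_UNSET_VAR")

# withenv sets the variable only for the duration of the block.
from_env = withenv("SINDBAD_SKETCH_DEPOT" => "/tmp/depot") do
    data_depot_sketch(env_data_depot_var="SINDBAD_SKETCH_DEPOT")
end
```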
SindbadUtils.getTimeAggregatorTypeInstance Function
getTimeAggregatorTypeInstance(aggr)
Creates and returns a time aggregator instance based on the provided aggregation.
Arguments
aggr::Symbol: Symbol specifying the type of time aggregation to be performed
aggr::String: String specifying the type of time aggregation to be performed
Returns
An instance of the corresponding time aggregator type.
Notes:
- A similar approach, getTypeInstanceForNamedOptions, is used in SindbadSetup for creating types of other named options.
SindbadUtils.getTupleFromLongTuple Method
getTupleFromLongTuple(long_tuple)
Convert a LongTuple to a regular tuple.
Arguments
long_tuple
: The input LongTuple
Returns
- A regular tuple containing all elements from the LongTuple
SindbadUtils.makeLongTuple Function
makeLongTuple(normal_tuple; longtuple_size=5)
Arguments:
normal_tuple: a normal tuple
longtuple_size: size to break down the tuple into
SindbadUtils.makeLongTuple Function
makeLongTuple(normal_tuple; longtuple_size=5)
Create a LongTuple from a normal tuple.
Arguments
normal_tuple: The input tuple to convert
longtuple_size: Size to break down the tuple into (default: 5)
Returns
- A LongTuple containing the elements of the input tuple
SindbadUtils.makeNamedTuple Method
makeNamedTuple(input_data, input_names)
Create a NamedTuple from input data and names.
Arguments
input_data: Vector of data values
input_names: Vector of names for the fields
Returns
- A NamedTuple with the specified names and values
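Building a NamedTuple from parallel vectors of names and values can be sketched with the NamedTuple{names}(values) constructor from Base. make_nt_sketch is a hypothetical name, and the conversion of names to Symbols is an assumption about the accepted input.

```julia
# Hypothetical sketch: names may be strings or symbols; values may be of any type.
function make_nt_sketch(input_data, input_names)
    length(input_data) == length(input_names) || error("data and names must have the same length")
    return NamedTuple{Tuple(Symbol.(input_names))}(Tuple(input_data))
end

nt = make_nt_sketch([1, "x", 2.5], ["a", "b", "c"])
```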
SindbadUtils.mergeNamedTuple Method
Merges algorithm options by combining default options with user-provided options.
This function takes two option dictionaries and combines them, with user options taking precedence over default options.
Arguments
def_o: Default options object (NamedTuple/Struct/Dictionary) containing baseline algorithm parameters
u_o: User options object containing user-specified overrides
Returns
- A merged object containing the combined algorithm options
SindbadUtils.nonUnique Method
nonUnique(x::AbstractArray{T}) where T
Finds and returns a vector of duplicate elements in the input array.
Arguments:
x
: The input array.
Returns:
A vector of duplicate elements.
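A duplicate finder can be sketched with a Set that tracks values already seen. non_unique_sketch is hypothetical, and it assumes that each duplicated value should be reported once, in order of its first repetition.

```julia
# Hypothetical sketch: report each value that occurs more than once, one time only.
function non_unique_sketch(x::AbstractArray{T}) where {T}
    seen = Set{T}()
    dups = T[]
    for v in x
        if v in seen
            v in dups || push!(dups, v)   # record a duplicate only the first time
        else
            push!(seen, v)
        end
    end
    return dups
end

dups = non_unique_sketch([1, 2, 2, 3, 3, 3, 4])
```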
SindbadUtils.removeEmptyTupleFields Method
removeEmptyTupleFields(tpl::NamedTuple)
Remove all empty fields from a NamedTuple.
Arguments
tpl
: The input NamedTuple
Returns
- A new NamedTuple with empty fields removed
SindbadUtils.replaceInvalid Method
replaceInvalid(_data, _data_fill)
Replaces invalid numbers in the input with a specified fill value.
Arguments:
_data: The input number.
_data_fill: The value to replace invalid numbers with.
Returns:
The input number if valid, otherwise the fill value.
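The scalar replacement can be sketched as below. What counts as invalid is an assumption here (missing, nothing, NaN, and infinite values); the package may use a different definition, and both function names are hypothetical.

```julia
# Assumption for this sketch: missing, nothing, NaN and Inf count as invalid.
is_invalid_sketch(x) = ismissing(x) || isnothing(x) || (x isa Number && (isnan(x) || isinf(x)))

# Return the input if it is valid, otherwise the fill value.
replace_invalid_sketch(_data, _data_fill) = is_invalid_sketch(_data) ? _data_fill : _data

# Broadcasting applies the scalar rule element by element.
cleaned = replace_invalid_sketch.([1.0, NaN, Inf, -2.0], 0.0)
```

Keeping the function scalar and relying on broadcasting lets the same rule work for arrays of any shape.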
SindbadUtils.setLogLevel Method
setLogLevel(log_level::Symbol)
Sets the logging level to the specified level.
Arguments:
log_level: The desired logging level (:debug, :warn, :error).
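With the Logging standard library, switching the level usually means installing a new global logger with the desired minimum level. The sketch below assumes a ConsoleLogger writing to stderr; set_log_level_sketch is a hypothetical name and only the three documented symbols are mapped.

```julia
using Logging

# Hypothetical sketch: map the documented symbols to Logging levels and
# install a console logger with that minimum level.
function set_log_level_sketch(log_level::Symbol)
    levels = Dict(:debug => Logging.Debug, :warn => Logging.Warn, :error => Logging.Error)
    haskey(levels, log_level) || error("unsupported log level: $log_level")
    logger = ConsoleLogger(stderr, levels[log_level])
    global_logger(logger)
    return logger
end

set_log_level_sketch(:warn)
active_level = Logging.min_enabled_level(global_logger())
```

After this call, messages below the chosen level (for example @info when the level is :warn) are filtered out.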
SindbadUtils.setTupleField Method
setTupleField(tpl, vals)
Set a field in a NamedTuple.
Arguments
tpl: The input NamedTuple
vals: Tuple containing field name and value
Returns
- A new NamedTuple with the updated field
SindbadUtils.setTupleSubfield Method
setTupleSubfield(tpl, fieldname, vals)
Set a subfield of a NamedTuple.
Arguments
tpl: The input NamedTuple
fieldname: The name of the field to set
vals: Tuple containing subfield name and value
Returns
- A new NamedTuple with the updated subfield
SindbadUtils.sindbadBanner Function
sindbadBanner(disp_text="SINDBAD")
Displays the given text as a banner using Figlets.
Arguments:
disp_text: The text to display (default: "SINDBAD").
c_olor: Whether to display the text in random colors (default: false).
SindbadUtils.stackArrays Method
stackArrays(arr)
Stacks a collection of arrays along the first dimension.
Arguments:
arr
: A collection of arrays to be stacked. All arrays must have the same size along their non-stacked dimensions.
Returns:
A single array where the input arrays are stacked along the first dimension.
If the arrays are 1D, the result is a vector.
Notes:
The function uses hcat to horizontally concatenate the arrays and then creates a view to stack them along the first dimension.
If the first dimension of the input arrays has a size of 1, the result is flattened into a vector.
This function is efficient and avoids unnecessary data copying.
SindbadUtils.tabularizeList Method
tabularizeList(_list)
Converts a list or tuple into a table using TypedTables.
Arguments:
_list
: The input list or tuple.
Returns:
A table representation of the input list.
SindbadUtils.tcPrint Method
tcPrint(d; _color=true, _type=true, _value=true, t_op=true)
Print a formatted representation of a data structure with type annotations and colors.
Arguments
d: The object to print
_color: Whether to use colors (default: true)
_type: Whether to show types (default: false)
_value: Whether to show values (default: true)
_tspace: Starting tab space
space_pad: Additional space padding
Returns
- Nothing (prints to console)
SindbadUtils.toUpperCaseFirst Function
toUpperCaseFirst(s::String, prefix="")
Converts the first letter of each word in a string to uppercase, removes underscores, and adds a prefix.
Arguments:
s: The input string.
prefix: A prefix to add to the resulting string (default: "").
Returns:
A Symbol with the transformed string.
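The transformation can be sketched with split, uppercasefirst, and join from Base. The sketch assumes that words are separated by underscores; to_upper_case_first_sketch is a hypothetical name.

```julia
# Hypothetical sketch: capitalize each underscore-separated word, drop the
# underscores, prepend the prefix, and return a Symbol.
function to_upper_case_first_sketch(s::String, prefix="")
    words = split(s, "_"; keepempty=false)
    return Symbol(prefix * join(uppercasefirst.(words)))
end

type_name = to_upper_case_first_sketch("month_anomaly", "Time")
```

A conversion like this is handy for turning setting strings into type names.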
SindbadUtils.toggleStackTraceNT Function
toggleStackTraceNT(toggle=true)
Modifies the display of stack traces to reduce verbosity for NamedTuples.
Arguments:
toggle: Whether to enable or disable the modification (default: true).
SindbadUtils.valToSymbol Method
valToSymbol(val)
Returns the symbol corresponding to the type of the input value.
Arguments:
val
: The input value.
Returns:
A Symbol representing the type of the input value.
Internal
Base.getindex Method
Base.getindex(a::TimeAggregatorViewInstance, I::Vararg{Int, N})
extend the getindex function for TimeAggregatorViewInstance type
Base.size Method
Base.size(a::TimeAggregatorViewInstance, i)
extend the size function for TimeAggregatorViewInstance type
Base.view Method
Base.view(x::AbstractArray, v::TimeAggregator; dim = 1)
extend the view function for TimeAggregatorViewInstance type
Arguments:
x: input array to be viewed
v: time aggregator struct with indices and function
dim: the dimension along which the aggregation should be done
SindbadUtils.collectColorForTypes Method
collectColorForTypes(d; _color = true)
Collect colors for all types from nested namedtuples.
Arguments
d: The input data structure
_color: Whether to use colors (default: true)
Returns
- A dictionary mapping types to color codes
SindbadUtils.getIndexForSelectedYear Method
getIndexForSelectedYear(years, sel_year)
a helper function to get the indices of the selected year (e.g., the first year) from the date vector
SindbadUtils.getIndicesForTimeGroups Method
getIndicesForTimeGroups(groups)
a helper function to get the indices of the date group of the time series
SindbadUtils.getTimeAggrArray Method
getTimeAggrArray(_dat::AbstractArray{T, 2})
a helper function to instantiate an array from the TimeAggregatorViewInstance for N-dimensional array
SindbadUtils.getTimeArray Method
getTimeArray(ar, ::TimeSizedArray || ::TimeArray)
a helper function to get the array of indices
Arguments:
ar: an array of time
array type: a type defining the type of array to be returned
::TimeSizedArray: indices as static array
::TimeArray: indices as normal array
SindbadUtils.getTypeOfTimeIndexArray Function
getTypeOfTimeIndexArray(_type=:array)
a helper function to easily switch the array type for indices of the TimeAggregator object
SindbadUtils.getTypes! Method
getTypes!(d, all_types)
Collect all types from nested namedtuples.
Arguments
d
: The input data structureall_types
: Array to store collected types
Returns
- Array of unique types found in the data structure
SindbadUtils.getdim Method
getdim(a::TimeAggregatorViewInstance{<:Any, <:Any, D})
get the dimension to aggregate for TimeAggregatorViewInstance type
SindbadUtils.mergeNamedTupleSetValue Function
mergeNamedTupleSetValue(o, p, v)
Set a field in an options object.
Arguments
o
: The options object (NamedTuple or mutable struct)p
: The field name to updatev
: The new value to assign
Variants:
- For NamedTuple options:
Updates the field in an immutable NamedTuple by creating a new NamedTuple with the updated value.
Uses the @set macro for immutability handling.
- For mutable struct options (e.g., BayesOpt):
Directly updates the field in the mutable struct using Base.setproperty!.
Returns:
- The updated options object with the specified field modified.
Notes:
This function is used internally by mergeNamedTuple to handle field updates in both mutable and immutable options objects.
Ensures compatibility with different types of optimization algorithm configurations.
Examples:
- Updating a NamedTuple:
options = (max_iters = 100, tol = 1e-6)
updated_options = mergeNamedTupleSetValue(options, :tol, 1e-8)
- Updating a mutable struct:
mutable struct BayesOptConfig
    max_iters::Int
    tol::Float64
end
config = BayesOptConfig(100, 1e-6)
updated_config = mergeNamedTupleSetValue(config, :tol, 1e-8)
SindbadUtils.temporalAggregation Function
temporalAggregation(dat::AbstractArray, temporal_aggregator::TimeAggregator, dim = 1)
a temporal aggregation function to aggregate the data using a given aggregator when the input data is an array
Arguments:
dat: a data array/vector to aggregate with function for the following types:
::AbstractArray: an array
::SubArray: a view of an array
::Nothing: a dummy type to return the input and perform no aggregation
temporal_aggregator: a time aggregator struct with indices and function to do aggregation
dim: the dimension along which the aggregation should be done