python - How to apply a function along one dimension and save the result as a new variable in a dataset?



Say I have a dataset like this:

import numpy as np
import xray  # the package was later renamed to xarray

nx1, nx2, nx3 = 5, 3, 20

ds = xray.Dataset()
ds.coords.update({'x1': ('x1', range(nx1)),
                  'x2': ('x2', range(nx2)),
                  'x3': ('x3', range(nx3))})

# Two data variables defined on all three dimensions
ds['a'] = (['x1', 'x2', 'x3'], np.random.randn(nx1, nx2, nx3))
ds['b'] = (['x1', 'x2', 'x3'], np.random.randn(nx1, nx2, nx3))

and a function func that takes the input variables a and b, works along the x3 dimension, takes in arrays of shape (nx3,), and outputs an array of shape (nx3,). If I wanted to apply this function to the dataset above and save the result as a new variable named c, is the following the way to do it,

required_shape = (len(ds.coords['x1']),
                  len(ds.coords['x2']),
                  len(ds.coords['x3']))

# Create the output variable once, filled with zeros
if 'c' not in ds:
    ds.update({'c': (['x1', 'x2', 'x3'], np.zeros(required_shape))})

# Call func once per (x1, x2) pair on the 1-D slices along x3
for ix1, x1 in enumerate(ds.coords['x1']):
    for ix2, x2 in enumerate(ds.coords['x2']):
        args = dict(x1=ix1, x2=ix2)
        a = ds['a'][args]
        b = ds['b'][args]
        c = func(a.values, b.values)
        ds['c'][args] = c

that is, by initialising a new array in the dataset and using for-loops over the other dimensions?

I'm not big on pandas, but a general solution that also works for other data types is to use a comprehension instead, which gets rid of the nested loops and the initialization step.

required_shape = (len(ds.coords['x1']),
                  len(ds.coords['x2']),
                  len(ds.coords['x3']))

# One func call per (x1, x2) pair; the flat list of results is reshaped to 3-D
ds['c'] = (['x1', 'x2', 'x3'], np.array([
    func(ds['a'][args].values, ds['b'][args].values)
    for ix1, x1 in enumerate(ds.coords['x1'])
    for ix2, x2 in enumerate(ds.coords['x2'])
    for args in (dict(x1=ix1, x2=ix2),)]).reshape(required_shape))
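
The same comprehension idea can be sketched with plain NumPy arrays, which may make it easier to see why the reshape recovers the original layout. This is only an illustration under assumptions: the func below (a cumulative sum of a * b) is a hypothetical stand-in for a function that works on 1-D arrays of shape (nx3,), and the arrays a and b are created directly rather than taken from the dataset.

```python
import numpy as np

def func(a, b):
    # Hypothetical stand-in: takes two arrays of shape (n,), returns shape (n,)
    return np.cumsum(a * b)

nx1, nx2, nx3 = 5, 3, 20
rng = np.random.default_rng(0)
a = rng.standard_normal((nx1, nx2, nx3))
b = rng.standard_normal((nx1, nx2, nx3))

# np.ndindex yields the (i, j) index tuples in row-major order, so stacking
# one result per pair and reshaping restores the (nx1, nx2, nx3) layout.
c = np.array([func(a[idx], b[idx]) for idx in np.ndindex(nx1, nx2)])
c = c.reshape(nx1, nx2, nx3)
```

Here c[i, j] holds func(a[i, j], b[i, j]) for every pair, so the array could be assigned to the dataset with the same (['x1', 'x2', 'x3'], c) tuple form used above.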

Edit: incidentally,

ds['c'] = func(ds['a'], ds['b']) 

seems to work fine, at least for a simple function like:

def func(a, b):
    return a + b
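
A likely reason the direct call works here is that a + b is elementwise, so applying it to the full arrays gives the same values as applying it slice by slice along x3. The sketch below checks this with plain NumPy arrays (an assumption made for illustration; the names direct and looped are hypothetical). A function that genuinely needs 1-D input would probably not behave this way and would still need the per-slice approach.

```python
import numpy as np

def func(a, b):
    return a + b

nx1, nx2, nx3 = 5, 3, 20
rng = np.random.default_rng(0)
a = rng.standard_normal((nx1, nx2, nx3))
b = rng.standard_normal((nx1, nx2, nx3))

# Direct call on the full 3-D arrays
direct = func(a, b)

# Slice-by-slice call along the last dimension, as in the nested loops
looped = np.empty_like(a)
for i, j in np.ndindex(nx1, nx2):
    looped[i, j] = func(a[i, j], b[i, j])

# True when both approaches produce identical values
same = np.array_equal(direct, looped)
```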
