Last_value and first_value functions
Formula syntax
The basic syntax of the last_value function is: last_value(operation(measure or attribute),query_groups(),{column to order by})
. For example: last_value(sum(balance),query_groups(),{transaction date})
.
The first argument, "sum(balance)" identifies the aggregate type and the measure column. The second argument, "query_groups()", partitions the data by column list; Use query_groups so that all columns in the search bar are included. The final argument, "{transaction date}", orders the results by that column.
Note that last_value
and first_value
functions support the query_groups() + ()
and query groups() - ()
arguments like group aggregate functions do. You can enter multiple columns in the third argument. For example, query_groups()+{partner_id}
would partition by all grouping columns in the search and ensure that partner_id is always included. query_groups()-{unit price, sales price}
partitions by all grouping columns in the search except unit price and sales price.
The inner aggregation within the first argument is specified within the last_value or first_value function, for example, last_value(sum(balance),…)
. The aggregation type, in this case, sum, determines how values are aggregated within partitions. If a non-aggregate value is returned, you can use a min or max aggregation.
You can include an outer aggregate function wrapping around the last_value/first_value function. If you don’t supply an outer aggregate, the aggregation defaults to SUM for numeric columns and MAX for non-numeric columns. You can use simple aggregates like sum, min, max, average, or advanced aggregates like group_average, moving_sum, and cumulative_sum.
If you do not specify an outer aggregate function in your formula, you can also add it from the search bar by entering "max", "min", "sum", etc..
Formula examples
Consider the following example formulas:
-
last_value(sum(stock_balance), query_groups(), {snapshot date})
This formula determines the last recorded inventory level for each product in a store. Retailers need to report stock balances daily but should not sum stock levels over time. -
last_value(sum(credit_balance), query_groups(), {transaction date})
This formula retrieves the last credit card balance per customer at the end of the year. Banks analyze the year-end balance instead of summing all daily balances, ensuring accuracy in financial reports. -
last_value(sum(employee_count), query_groups(), {payroll_date})
This formula finds the number of employees as of the last recorded payroll date per department. HR teams track active employees as of a given date to monitor workforce trends. -
last_value(sum(patient_count), query_groups(), {admission_date})
This formula calculates the number of patients in a hospital as of the last recorded day. Hospitals monitor daily inpatient count rather than summing patient admissions over time. -
last_value(sum(active_users), query_groups(), {snapshot date})
This formula determines the last active user count for a product subscription at the end of the quarter. Companies track users per billing cycle instead of summing logins over time. -
last_value(max(event_date), {client_id}, {event_date})
This formula determines the last date of user interaction. With this value, you can calculate additional calculations such as the number of days since this value. You can then use the calculated values in attrition analysis. note that query sets are an alternative to this use case. -
last_value(sum(occupied_capacity), query_groups(), {reporting date})
This formula is used to identify the last known occupied storage space in a warehouse by section. Warehouse managers track real-time storage usage rather than summing daily occupancy. -
last_value(sum(order_value), query_groups(), {order_date})
This formula retrieves the last order amount placed by each customer. Businesses analyze the most recent purchase behavior instead of total spending over time.
Advanced calculation examples
-
moving_average(last_value(sum(balance), query_groups(), {transaction date}), 3, -1, transaction date)
This formula finds the rolling three-month average of the last recorded balances. -
group_aggregate(sum(last_value(sum(balance), query_groups(), {transaction date})), {region}, query_filters()-{region})
This formula calculates the balance at regional level while preserving query filters.
Last_value example use case
In the following use case, consider a data set that receives a snapshot of products each day. There are no missing dates in the time series, all attribute (date/product/store) combinations are represented, and zero values are explicitly recorded rather than being null or missing. For this example, we will use the following closing balance formula:
fxClosingBalance = last_value(sum(balance), query_groups(), {date})
The following table shows the sample data from January 1st to the present day, March 15th.
Date | Product | Balance |
---|---|---|
1st January |
Apples |
10 |
1st January |
Pears |
10 |
1st January |
Grapes |
0 |
2nd January |
Apples |
10 |
2nd January |
Pears |
5 |
2nd January |
Grapes |
5 |
… |
… |
… |
31st January |
Apples |
5 |
31st January |
Pears |
10 |
31st January |
Grapes |
15 |
1st February |
Apples |
10 |
1st February |
Pears |
5 |
1st February |
Grapes |
20 |
… |
… |
… |
28th February |
Apples |
20 |
28th February |
Pears |
5 |
28th February |
Grapes |
0 |
… |
… |
… |
15th March |
Apples |
20 |
15th March |
Pears |
5 |
15th March |
Grapes |
0 |
To find the current inventory balance, you can search for fxClosingBalance
. The formula results in 25. The result is the sum of all balances on the last recorded date, March 15th.
If you search for fxClosingBalance by product
, the following table results:
Product |
---|
Balance |
Apples |
20 |
Pears |
5 |
Grapes |
0 |
The result is the same, 25.
If you search for fxClosingBalance by product monthly
, you receive the final balance for each month, 30 for January, 25 for February, and 25 for March. The following table results:
January | February | March | |
---|---|---|---|
Apples |
5 |
20 |
20 |
Grapes |
15 |
0 |
0 |
Pears |
10 |
5 |
5 |
30 |
25 |
25 |
Note that the date for March is the 15th. The value does not have to be the absolute last date of the period.
If you search for fxClosingBalance by product yearly
, you receive a single value, 25, because the data set only contains one year’s data.
The following table results:
Year | Product | Balance |
---|---|---|
2025 |
Apples |
20 |
2025 |
Pears |
5 |
2025 |
Grapes |
0 |
The final date is March 15th, so the formula returns the product balances for that date, 25.
First_value use case
For these examples, we will be using the following formula to find the opening balance of a data set:
fxOpeningBalance = first_value(sum(balance), query_groups(), {date})
For this example, we will be examining a data set with incomplete data. In the following table, a new product, oranges, was introduced on March 15th. There are no entries for oranges in the previous months, so not all combinations of date and products have been defined.
Date | Product | Balance |
---|---|---|
1st January |
Apples |
10 |
1st January |
Pears |
10 |
1st January |
Grapes |
0 |
2nd January |
Apples |
10 |
2nd January |
Pears |
5 |
2nd January |
Grapes |
5 |
… |
… |
… |
31st January |
Apples |
5 |
31st January |
Pears |
10 |
31st January |
Grapes |
15 |
1st February |
Apples |
10 |
1st February |
Pears |
5 |
1st February |
Grapes |
20 |
… |
… |
… |
28th February |
Apples |
20 |
28th February |
Pears |
5 |
28th February |
Grapes |
0 |
… |
… |
… |
15th March |
Apples |
20 |
15th March |
Pears |
5 |
15th March |
Grapes |
0 |
15th March |
Oranges |
10 |
To find the opening inventory balance, you can search for fxOpeningBalance
, for which you will receive a value of 20, the sum of all balances on January 1st.
If you search for fxOpeningBalance by product
, however, you will receive an answer of 30. Reference the following table:
Product | Balance |
---|---|
Apples |
10 |
Pears |
10 |
Grapes |
0 |
Oranges |
10 |
Although the first date in the table is January 1st, the value for oranges on March 15th is included in the result, as it is the first value for this product.
In order to fix this dataset to prevent erroneous values in the first_value function, you need to include zero value balances for oranges for the preceding time periods.
Date | Product | Balance |
---|---|---|
1st January |
Apples |
10 |
1st January |
Pears |
10 |
1st January |
Grapes |
0 |
1st January |
Oranges |
0 |
2nd January |
Apples |
10 |
2nd January |
Pears |
5 |
2nd January |
Grapes |
5 |
… |
… |
… |
31st January |
Apples |
5 |
31st January |
Pears |
10 |
31st January |
Grapes |
15 |
1st February |
Apples |
10 |
1st February |
Pears |
5 |
1st February |
Grapes |
20 |
1st February |
Oranges |
0 |
… |
… |
… |
28th February |
Apples |
20 |
28th February |
Pears |
5 |
28th February |
Grapes |
0 |
… |
… |
… |
15th March |
Apples |
20 |
15th March |
Pears |
5 |
15th March |
Grapes |
0 |
15th March |
Oranges |
10 |
Now, if you search for fxOpeningBalance by product
, you receive a value of 20. The zero value balance for oranges on January 1st is included.
Product | Balance |
---|---|
Apples |
10 |
Pears |
10 |
Grapes |
0 |
Oranges |
0 |
Late-arriving snapshot data example
In the following scenario, the snapshot data fails the principle that all dates are represented. Data such as store inventory may arrive on different dates. The business considers the last store snapshot to be valid until a more recent one arrives.
The following is a subset of the snapshot table. Note that the most recent transaction date is March 15th. However, the snapshot has not been received for all stores on this date.
Date | Store | Product | Balance |
---|---|---|---|
14th March |
Kirkwood |
Apples |
10 |
14th March |
Kirkwood |
Pears |
5 |
14th March |
Kirkwood |
Grapes |
0 |
15th March |
Park City |
Apples |
20 |
15th March |
Park City |
Pears |
10 |
15th March |
Park City |
Grapes |
10 |
To find out the closing balance for this data, consider the following formula:
fxClosingBalance = last_value(sum(balance), query_groups(), {date})
If you search for fxClosingBalance
, you get a result of 40, the sum of all balances on the last date, March 15th. This does not include the values for Kirkwood, since the last date for that data is March 14th.
If you then search for fxClosingBalance by store
, you get a result of 55. The value for Kirkwood is included, as the search breaks the result down by store.
Store |
---|
Balance |
Kirkwood |
15 |
Park City |
40 |
In order to ensure the total values remain consistent from an aggregate to detail, the formula must be augmented to ensure that Store is always included as a partitioning column.
Use the following formula:
fxClosingBalance = last_value(sum(balance), query_groups()+{Store}, {date})
Now, when you search for fxClosingBalance
or fxClosingBalance by product
, you get a result of 55 values for both searches.
Limitations
-
Semi-additive functions cannot span multiple fact tables.
-
Semi-additive functions cannot contain only constant expressions. For example, if you create a formula with no references to a column, such as
last_value(sum(1), {}, {true})
, ThoughtSpot will not support the function. -
You cannot combine different partitioning and ordering clauses in different semi-additive functions from the same table, in the same query. That is, a case where formula 1 partitions on Date and Product, and formula 2 partitions on Date, Product, and Client.
-
Average, Variance, Standard Deviation and Unique Count do not work with semi-additive functions across an attribution query. That is, multiple fact tables with at least one non-shared attribute.
-
We do not support semi-additive functions and unique count functions from the same table. Note that a work-around exists by wrapping the unique count in a group_aggregate function. For example,
group_aggregate(unique_count(product),query_groups()+{},query_filters())
. -
Advanced aggregates (group, cumulative, moving and rank) cannot be used within the definition of semi-additive functions. Note they can be used to wrap these functions as outer aggregation.
-
ThoughtSpot will support
first_value
andlast_value
functions for Redshift and Google BigQuery beginning in the 10.1.0.cl release.
Related information