Excel Database Functions: How to Use Them [Tutorial Video] (21:22)
In this lesson, you’ll learn how to query and aggregate data tables more effectively with Database Functions such as DSUM and DCOUNT – and you’ll understand how to enter complex sets of AND and OR conditions to do so.
Excel database functions fix some, but not all, of the problems with long SUMIFS and SUMPRODUCT formulas.
For example, formulas that use SUMIFS and SUMPRODUCT are often difficult to enter and understand, sometimes dates don’t work correctly, and it’s difficult to set up criteria such as orders that match a certain date range AND a certain Sales Rep but NOT a certain industry.
Database Functions Definition: Excel database functions allow you to sum and count data in spreadsheets based on AND and OR conditions that join together multiple sets of criteria, such as dates, amounts, geographies, industries, and product identifiers.
Database functions do require more time to set up because you must create extra rows in the spreadsheet, and you must get the syntax exactly right, or nothing will work.
A basic Database function setup might look like this:
Some of the key Excel database functions include the following:
=DSUM: Add numbers in a field (i.e., table column) that match specific conditions
=DCOUNT: Count # cells in a field (i.e., a table column) that match specific conditions
=DCOUNTA: Same, but only for nonblank cells
=DGET: Extracts single row from table that matches specific conditions
How Database Functions Work
A few notes on the functionality in the screenshot above:
1) The order of the fields in this area must be the same as the order of the fields in the table itself (Order_Table). If Order Date follows Amount in the table, Order Date must follow Amount here:
2) You do NOT need to use all the fields in the table – you can choose any of them. We only use a few fields from the table here.
3) And you can repeat the same field multiple times as long as the repeats are all entered after the first entry, with nothing in between.
4) In a single row, the conditions are joined with AND. In other words, for the first row here, matching entries must be in the “Industrials” industry, the “Midwest” region, AND between January 1, 2021 and December 31, 2024.
5) Multiple rows are joined with OR. So, in the example above, matching entries must be in the “Industrials” industry, the “Midwest” region, AND between January 1, 2021 and December 31, 2024… OR they must be in the “Energy” industry, the “Northeast” region, AND between January 1, 2021 and December 31, 2024.
6) The last part – Summary!$J$6:$P$8 – cannot include empty rows, or Database functions such as DSUM will sum up the entire table.
Database functions are most useful for writing queries with complex criteria, such as ones that involve multiple dates, amounts, regions, industries, and more.
They tend to be overkill for simple queries, such as summing up all orders between two dates, or summing up all orders in a certain industry; for those, SUMIFS and SUMPRODUCT are fine.
The Most Common Functions
The most common Database function is DSUM, which looks at all the rows in a table and sums up a specific field (column) in each matching row:
=DSUM(Database, Field, Criteria)
The Database part is the range of cell references, which must contain the field names in the header. This range should ideally be a Data Table, but it can be any range of cells in the spreadsheet.
The Field part is the column or field you want to add up or count. You can use a column number or the exact name in text; it’s best to create a direct link to the title in the table header to avoid errors.
The Criteria part is the smaller range we created above: Summary!$J$6:$P$8.
It must include at least one field name that’s in the Database AND at least one other condition to be evaluated. If it does not, DSUM will sum up everything in the table.
Each function mentioned above – DSUM, DCOUNT, DCOUNTA, and DGET – accepts these same inputs: Database, Field, and Criteria.
All the normal operators, such as <, >, <=, >=, <>, =, *, and ?, still work in the Criteria range of cells, but only if they make logical sense for the underlying data.
The biggest problem with Database functions is that it’s extremely easy to make a small mistake that results in the entire function not working.
Why did these functions suddenly stop working? Because we hard-coded the “Region” field in the header and added an extra space at the end:
This problem shows you why it’s best to link directly to the field names in the Order Table here.
Besides this issue, another drawback of database functions is that it’s not easy to see trends and patterns when you use them.
For example, how have sales to Energy companies changed over 5-10 years? What percentage of total sales did they represent each year, and which sales reps were most responsible?
For those types of tasks, you are better off using pivot tables and related functionality, such as Power Pivot, which we cover in our Excel & Fundamentals course.
Excel Database Functions Multiple Criteria
If you want to set up a function with multiple criteria, remember the rules above: individual fields in a row are joined together with an AND, and individual rows are joined together with an OR.
Therefore, the following function:
Will do the following:
1) First, it will find and add all Order Amounts in the Industrials industry AND in the Midwest AND placed between January 1, 2021 and December 31, 2024 (inclusive).
2) Then, it will find and add all Order Amounts in the Industrials industry AND in the Northeast AND placed between January 1, 2021 and December 31, 2024 (inclusive).
3) Next, it will find and add all Order Amounts in the Energy industry AND in the Midwest AND placed between January 1, 2021 and December 31, 2024 (inclusive).
4) Finally, it will find and add all Order Amounts in the Energy industry AND in the Northeast AND placed between January 1, 2021 and December 31, 2024 (inclusive).
Since each row is joined with an OR, the function will then add up all these Order Amounts.
When you enter multiple rows of criteria, you always get results that match any of the rows.
Database functions have their uses, but we tend to use them less than sensitivity analysis in financial models and less than pivot tables in data analysis.