SQL Subqueries

Writing subqueries is an important skill to develop if you want to work effectively with your database resources.

EXAMPLES

One of the first subqueries you will probably write will sit in the WHERE clause to limit results.
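For example, here is a minimal sketch (the table and column names are illustrative, not from any particular system) that limits customers to those with at least one large invoice:

    -- Return only customers that have at least one invoice over 1,000.
    -- The subquery in the WHERE clause produces the list of matching IDs.
    SELECT customer_id, customer_name
    FROM customers
    WHERE customer_id IN (
        SELECT customer_id
        FROM invoices
        WHERE invoice_amt > 1000
    );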

If you need to display the value used to filter, you can use a subquery within your SELECT statement. Here we calculate the field ‘avgamt’ using a scalar subquery, i.e. it returns the same value on every row.
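A sketch of that pattern, reusing the same illustrative tables:

    -- The scalar subquery runs once and returns a single value,
    -- so 'avgamt' is identical on every row of the result.
    SELECT invoice_id,
           invoice_amt,
           (SELECT AVG(invoice_amt) FROM invoices) AS avgamt
    FROM invoices;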

A correlated subquery is one of the more complex query structures. Here a subquery within the SELECT statement uses a value returned by the main query against the customer master table to count the number of invoices; the field ‘total_trx’ holds that invoice count for each customer.
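One way that correlated pattern might look, with the same illustrative names (the inner query references the outer query’s customer row):

    -- The inner COUNT re-runs for each row of the customer master,
    -- using the outer row's customer_id to count that customer's invoices.
    SELECT c.customer_id,
           c.customer_name,
           (SELECT COUNT(*)
            FROM invoices i
            WHERE i.customer_id = c.customer_id) AS total_trx
    FROM customers c;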

Understanding how to write effective subqueries is the first step in automating custom processes in your database. Managing transaction lists in your system, automating archiving, and producing monthly sales reports all become much easier once writing subqueries in SQL becomes second nature.

If you’re interested in learning more about scalar subqueries, multiple row subqueries, and correlated subqueries check out this page.

To speak to an expert about your reporting headaches email [email protected].


THE POWER OF BI

Investing in business intelligence automation can help businesses make better decisions, improve efficiency, and ultimately increase profitability. By leveraging the power of data, businesses can gain a deeper understanding of their operations, customers, and markets. With this knowledge, they can make more informed decisions, optimize their processes, and uncover new opportunities for growth.

Business intelligence automation takes this to the next level by automating the data collection, analysis, and reporting process. This means that businesses can get real-time insights into their operations, without spending hours manually collecting and analyzing data. With the help of automation, businesses can identify trends, patterns, and insights that might have gone unnoticed otherwise, and use this information to make better decisions.

Business intelligence automation also improves efficiency by streamlining processes and reducing the amount of time and resources required to collect and analyze data. By automating routine tasks, businesses can free up their staff to focus on higher-level strategic activities. This can result in cost savings and increased productivity.

Ultimately, investing in business intelligence automation can lead to increased profitability. By improving decision-making, optimizing processes, and identifying new opportunities, businesses can boost their revenue and reduce costs. This can have a significant impact on the bottom line and position businesses for long-term success.

Investing in business intelligence automation is an investment in the future success of your business. By unlocking the power of your data and streamlining your processes, you can make better decisions, improve efficiency, and increase profitability.

Don’t miss out on the opportunity to unlock the power of your data: email [email protected].


STAR SCHEMA vs SNOWFLAKE

There’s been a lot of talk about star schemas and snowflake schemas at SQLBits this week. If you’re not familiar with the terms, they refer to two divergent approaches to data modeling that are important to consider when moving to a cloud-based reporting solution.

WHAT ARE THEY?

A star schema is a simple, denormalized structure where the fact table sits at the center, surrounded by dimension tables. The diagram resembles a star, hence the name.

On the other hand, a snowflake schema is a more normalized structure that breaks down dimension tables into sub-dimension tables. This makes the schema look like a snowflake, hence the name. This allows for better scalability and more complex relationships between tables. However, it can also make querying data more complicated and slower due to the increased number of joins required.

Figure: star schema (left) vs snowflake schema (right)
https://www.researchgate.net/figure/Star-schema-left-vs-Snowflake-schema-right_fig4_227246694

The star schema is simpler and faster for querying data but less flexible, while the snowflake schema is more flexible and scalable but can be more complex and slower to query. Each approach has its purpose and its strengths, and in my experience a combination of the two is often best.
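As a rough illustration with hypothetical table names, the same sales question needs one extra hop in the snowflake version:

    -- Star schema: category lives on the denormalized product dimension.
    SELECT p.category_name, SUM(f.sales_amt) AS total_sales
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.category_name;

    -- Snowflake schema: category sits in a sub-dimension table,
    -- so the query chains through an additional join.
    SELECT cat.category_name, SUM(f.sales_amt) AS total_sales
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_category cat ON cat.category_key = p.category_key
    GROUP BY cat.category_name;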

Snowflake schemas are the standard in older systems and in highly controlled transactional systems; it used to be taboo to store a value redundantly in two places, and star schemas are more likely to fall out of sync because you have to update values in multiple locations. Over time, however, it has become clear that chaining multiple joins in ad hoc queries is inefficient, especially when reporting on large amounts of data, and particularly against tables that are actively processing transactions.

MODERN REPORTING FOR LEGACY SYSTEMS

Preparing your data for consumption by converting legacy snowflake-schema tables into a simplified set of star-schema tables, which users can query without locking or blocking transactional tables, makes the most of both design approaches. Talk to one of our data experts about how your data factory or warehouse resources can be simplified and turned into useful, meaningful business intelligence, using modern storage along with an intelligent, efficient design that optimizes scanning and extraction.
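One sketch of the idea, reusing the hypothetical tables from the earlier example and a T-SQL-style SELECT ... INTO:

    -- Materialize a denormalized, star-style reporting table on a schedule,
    -- so ad hoc queries read this copy instead of the transactional tables.
    SELECT f.invoice_id,
           f.invoice_date,
           f.sales_amt,
           p.product_name,
           cat.category_name
    INTO   rpt_sales_star
    FROM   fact_sales f
    JOIN   dim_product p ON p.product_key = f.product_key
    JOIN   dim_category cat ON cat.category_key = p.category_key;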

You can’t make the most of modern tools by simply importing your legacy data into a data lake; it takes some tweaks with the help of a reporting expert. Get the advice of someone who understands the pitfalls, keep your queries efficient to keep your costs low, and watch your data turn into meaningful action.

Email today for help with your data optimization questions.




COSMOS DB AND SQL

THE EVOLUTION OF DB DESIGN

The evolution of database design in Cosmos DB forces us to rethink our traditional relational structure.

The biggest cost factor in databases used to be storage, but today latency and reliability across geographic locations are bigger concerns.

In a typical relational database you have a table for customers and a table for customer addresses. Your primary key in one table is a foreign key in another.
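For instance, a minimal sketch with illustrative names:

    -- Normalized design: the address lives in its own table and is
    -- joined back to the customer by foreign key.
    SELECT c.customer_id, c.customer_name, a.city, a.postcode
    FROM customers c
    JOIN customer_addresses a ON a.customer_id = c.customer_id;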

These structures ensure that your master data and transactional tables take up as little space as possible; they are easier to search and easier to back up and restore. In Azure Cosmos DB the cost for storage is nominal, and Cosmos indexes everything! Now the bigger concern is designing your data structure for throughput: the key is to minimize the work each query does, which in turn minimizes the request units (RUs) consumed. Now we think about references versus embedded data, and about how people will actually use our data.


QUESTIONS TO ASK

The first question we have to ask is: what is our partition key based on? What counts as an important unique record? This determines when we need a new document.

Next, how can we deliver everything we need from this single document? We can embed up to 2 MB of data in a single document. Embedding data in a single document minimizes touchpoints for your query and reduces your costs.
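A sketch in the Cosmos DB NoSQL API’s SQL-like dialect (the container and property names are illustrative): because the addresses are embedded, a single cheap read against the partition key returns everything:

    -- One document holds the customer and its embedded addresses,
    -- so no second lookup is needed and the RU cost stays low.
    SELECT c.customerId, c.name, c.addresses
    FROM c
    WHERE c.customerId = 'CUST-1001'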


TECHNICAL DOCUMENTATION

For more about Containers and Documents in Cosmos DB see the documentation online from Microsoft.

Partitioning and horizontal scaling – Azure Cosmos DB | Microsoft Learn.

In a nutshell, containers are like tables and documents are like table records, but you can embed a lot in a document. Cosmos DB documents are complete records, in contrast to SQL table records, which usually must be joined with other records to provide useful information.


OTHER CONSIDERATIONS

  • We have to think about how we update records everywhere a value is referenced, since we are embedding values in multiple locations.
  • Cosmos DB does not support joins across documents or containers; its JOIN keyword only performs self-joins within a single document. This seems like a significant limitation when converting most existing processes, as relational databases are heavily dependent on joins.
  • Cosmos DB does not support T-SQL; queries use its own SQL-like dialect, with SDKs for .NET, Java, Python, Ruby, and JavaScript. It is intended to speak the language of the web, for always-available, cloud-based applications.
  • The output is not immediately available as a table view; a Table API is available for Cosmos DB if you need to see your data in a tabular format.

COSTS – COSMOS DB CAPACITY CALCULATOR

Cosmos DB is designed for web-ready applications that require low-latency, flexible, scalable data. Optimizing for throughput to minimize costs should not be a significant concern as long as you have a partner who can help with your technical design. Request-unit costs are small, and there are a few different pricing models to discuss with your Microsoft partner. Under the serverless model, pricing is $0.25 per 1 million request units (RUs); a well-designed read should cost about 1 RU, a query around 2.8 RUs, and a write 7-9 RUs. Querying across multiple containers can become more costly, and be cautious of services running queries every few seconds, which can get expensive.

Use the online capacity calculator to help estimate your monthly costs: https://cosmos.azure.com/capacitycalculator


SUMMARY

Cosmos DB and SQL are different databases that store data in different formats and use different languages. Cosmos DB is best for cloud-based applications that rely on high scaling and are always available. SQL databases are best for structured, transactional applications that require data consistency and well-defined schemas.

For help with your data project, email MIS today at [email protected].
