Friday, July 28, 2023

Introduction to Database Management System

A Database Management System (DBMS) is a collection of programs that enables users to create and maintain a database. The DBMS is general-purpose software that facilitates the processes of defining, constructing, manipulating, and sharing the database among various users and applications.

More About DATA, Information and Knowledge:

Knowledge refers to the practical application of information. Information can be transported, stored, and shared without much difficulty, but the same cannot be said of knowledge. Knowledge necessarily involves personal experience and practice.

Database systems are meant to handle an extensive collection of information. Management of data involves both defining structures for storing information and providing mechanisms for manipulating that stored information. Moreover, the database system must ensure the safety of the information stored, despite system crashes or attempts at unauthorized access.

Why use a DBMS?

To develop software applications in less time.

Data independence and efficient use of data.

For uniform data administration.

For data integrity and security.

For concurrent access to data, and data recovery from crashes.

To use a user-friendly declarative query language.

Where is a DBMS Used?

Airlines: reservations, schedules, etc

Telecom: calls made, customer details, network usage, etc

Universities: registration, results, grades, etc

Sales: products, purchases, customers, etc

Banking: all transactions etc

Advantages of DBMS:

A DBMS manages data and has many benefits. These are:

Data independence: Application programs should be as free or independent as possible from details of data representation and storage. A DBMS can supply an abstract view of the data, insulating application code from such details.

Efficient data access: A DBMS utilizes a mixture of sophisticated concepts and techniques for storing and retrieving data efficiently. This feature becomes important in cases where the data is stored on external storage devices.

Data integrity and security: If data is accessed through the DBMS, the DBMS can enforce integrity constraints on the data.

Data administration: When several users share the data, centralizing the administration of data can offer significant improvements. Experienced professionals understand the nature of the data being managed and can be responsible for organizing the data representation to reduce redundancy and make data retrieval efficient.

Components of DBMS:

Users: Users may be of any kind, such as a DB administrator, system developer, or database user.

Database application: Database applications may be departmental, personal, organizational, and/or internal.

DBMS: Software that allows users to create, access, and manipulate the database.

Database: Collection of logical data as a single unit.
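To make the components above concrete, here is a minimal sketch of the define-construct-manipulate cycle using Python's built-in sqlite3 module (a lightweight embedded DBMS). The table and names are illustrative only, not from any real system.

```python
# A minimal sketch of the "define, construct, manipulate" cycle using
# Python's built-in sqlite3 module (a lightweight embedded DBMS).
import sqlite3

conn = sqlite3.connect(":memory:")   # construct a database (in memory)
cur = conn.cursor()

# Define: create a structure for storing data
cur.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# Manipulate: insert and query data declaratively
cur.executemany("INSERT INTO student (name) VALUES (?)",
                [("Asha",), ("Ravi",)])
rows = cur.execute("SELECT name FROM student ORDER BY name").fetchall()
print(rows)  # [('Asha',), ('Ravi',)]
conn.close()
```

The same declarative statements would work, with minor dialect differences, against a full client-server DBMS such as MySQL or Postgres.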



Friday, April 7, 2023

Top 15 Database Monitoring Tools

 

 
  1. Datadog Database Monitoring A cloud-based application monitoring service that includes database performance checks.
  2. ManageEngine Applications Manager Includes monitoring screens for both SQL-based and NoSQL databases. Runs on Windows Server and Linux.
  3. SolarWinds AppOptics APM A comprehensive cloud-based application performance monitor that includes specialized processes for monitoring databases.
  4. Site24x7 Server Monitoring An online monitoring package that includes SQL monitoring and analysis functions.
  5. Paessler PRTG Network Monitor Database monitoring functions are part of this all-in-one network, server, and applications monitor. Runs on Windows Server.
  6. SolarWinds SQL Sentry Live database performance monitoring with automated index defragmentation.
  7. Atera A remote management solution for managed service providers that includes database backup automation and supervision.
  8. dbWatch Database Control A database-focused tool that unifies monitoring for all databases in an enterprise running SQL Server, Oracle, Sybase, MariaDB, MySQL, and Postgres.
  9. Idera SQL Diagnostic Manager A specialist database monitor for MySQL or SQL Server.
  10. AimBetter This SaaS system remotely monitors database performance and includes the services of database experts for SQL Server, Oracle, and SAP.
  11. SQL Power Tools Logs database performance metrics and scans for anomalous behavior to detect any intrusions.
  12. Red-Gate SQL Monitor Real-time database monitor with color-coded statuses and some great data visualizations.
  13. Lepide SQL Server Auditing A database monitor that is prized for its cybersecurity features.
  14. ManageEngine Free SQL Health Monitor A competent free database performance monitor from a leading infrastructure management producer.
The best database monitoring applications

Databases expand. Larger table sizes and more access attempts put a strain on server resources such as CPU and memory. A database that appeared to be functioning properly can gradually become cumbersome and slow, lacking sufficient resources to serve all queries; requests are queued, and concurrent activities create mutual resource locks. Constant analytical monitoring can detect evolving resource shortages and access conflicts in order to prevent performance issues.

Our methodology for selecting a database monitoring tool.

We reviewed the database monitoring tools market and analyzed tools based on the following criteria:

  • The ability to attach to database instances from different DBMSs.
  • SQL query troubleshooting and optimization
  • Database resource and server resource monitoring
  • Alerts for resource shortages and performance deterioration
  • An easy-to-use interface
  • Secure access procedures that include authentication and multiple user accounts
  • A free trial, demo, or money-back period for no-risk assessment
  • A reasonable price that reflects the quality of the product and offers value for money.

Wednesday, March 29, 2023

Effective User Management Performed by DBAs

The user or users with the administrator role are solely responsible for managing users who have been granted database access.

The administrator is in charge of controlling how other users in your organization access your database. The administrator, for example, can add new users, deny access to users who have left the organization, and assist users who are unable to log in.

Administrators are responsible for the following tasks:

  • Create new users.
  • Remove users.
  • Control user access.
  • Configure user connection privileges.
  • Change user permissions.
  • View current user permissions.
  • Change user passwords.
Database user management typically involves the following tasks:

1. Creating user accounts: The first step in user management is to create user accounts for individuals or groups who need access to the database. Each account is typically assigned a unique username and password.

2. Modifying user accounts: Administrators may need to modify user accounts from time to time, for example, to update passwords or change access permissions.

3. Granting and revoking access permissions: Administrators can assign specific privileges to users, such as read-only access or the ability to make changes to the data. They can also revoke these permissions if necessary.

4. Monitoring user activity: It's important to monitor user activity within the database to identify any suspicious behavior or potential security breaches.

5. Deleting user accounts: When an employee leaves an organization or no longer requires access to a particular database, their user account should be deleted to prevent unauthorized access.
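The five tasks above map onto a handful of SQL statements in most client-server systems. The sketch below lists MySQL-style statements for a typical account lifecycle; the statements are shown as strings only (they are not executed here, and the user and schema names are made up for illustration).

```python
# MySQL-style statements for a typical user-management lifecycle.
# Shown as strings only; 'report_user' and the 'sales' schema are
# illustrative names, not from any real system.
lifecycle = {
    "create": "CREATE USER 'report_user'@'localhost' IDENTIFIED BY 's3cret';",
    "grant":  "GRANT SELECT ON sales.* TO 'report_user'@'localhost';",
    "modify": "ALTER USER 'report_user'@'localhost' IDENTIFIED BY 'n3w_pass';",
    "revoke": "REVOKE SELECT ON sales.* FROM 'report_user'@'localhost';",
    "delete": "DROP USER 'report_user'@'localhost';",
}
for step, stmt in lifecycle.items():
    print(f"{step:>6}: {stmt}")
```

Exact syntax varies by DBMS (Postgres, for example, uses `CREATE ROLE` and `ALTER ROLE`), but the create/grant/modify/revoke/delete pattern is the same.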

Database Security Best Practices Followed by DBAs

Database security is a combination of controls and countermeasures used to protect a database management system. The best practices cover all aspects of physical and digital data center and information security.

Database security issues stem from a diverse set of security threats. Consider the risks and address each issue before it leads to a compromise. The following are examples of common database security threats:

  • Human error.
  • Threats from within.
  • Vulnerabilities in software.
  • Malware.
  • Attacks on backups.
  • Physical site security.

Best Database Security Practices Performed by DBAs

The security features protect one or more of the database's following aspects:

  • The database server's physical location.
  • The network infrastructure required to access the database.
  • The information contained within the database.
  • The database administration system.
  • Database-related applications.
Database security has a layered structure. The outside layers encapsulate the inner layers of security closest to the data, providing a barrier of protection around the database.

  • Use Firewalls
  • Use Secure Applications and Software
  • Practice Database Encryption
  • Secure User Access
  • Perform Regular Updates
  • Perform Regular Backups
  • Practice Monitoring
  • Perform Security Testing
  • Define Clear Security Procedures
  • Maintain Hardware Security

Why Is Database Security Critical?

Data security is a high-priority security task. Database security protects data from potentially disastrous situations. The following are some of the consequences of a database breach:

  • Penalties and fines for noncompliance.
  • Expensive repairs.
  • Intellectual property compromise.
  • Reputational harm.
Conclusion:

Database security encompasses many aspects. This post covers only a few of the best practices for keeping your information system safe.


Tuesday, March 28, 2023

Common Causes of Slow Queries and Technique to improve the Query Performance Effectively

When you run a query, you may notice that the output is much slower than expected, taking a few seconds, minutes, or even hours to load. What could be causing this?

There are numerous causes for a slow query, but a few common ones are as follows:

  • The database size, which is made up of the number of tables and the size of each table. The larger the table, the longer the query will take, especially if you're scanning the entire table each time.
  • Queries that are not optimized can result in slower performance. For example, if your database hasn't been properly indexed, the results of your queries will load much slower.

When you run a query, you should see output like the following:

[Sample SELECT Output]

As can be seen, the output includes the number of rows returned as well as the execution time, which is given in the format 0.00 seconds.
The EXPLAIN statement is a built-in tool that can be used to determine why your query is taking so long to run.

EXPLAIN Your Query's Performance

The EXPLAIN statement provides information about how MySQL executes your statement—that is, how MySQL plans on running your query. With EXPLAIN, you can check if your query is pulling more information than it needs to, resulting in a slower performance due to handling large amounts of data.

This statement works with SELECT, DELETE, INSERT, REPLACE, and UPDATE. When run, it outputs a table that looks like the following:

Sample EXPLAIN Output

With SELECT, the EXPLAIN statement tells you what type of select you performed, the table that the select is being performed on, the number of rows examined, and any additional information, as shown in the output table.

The EXPLAIN statement in this case revealed that the query used a simple select (rather than a subquery or union select) and that 298,980 rows were examined (out of a total of about 300,024 rows).

When determining why a query is slow, the number of rows examined can be useful. For example, if you notice that your output is only 13 rows, but the query is examining approximately 300,000 rows—nearly the entire table!—this could be the cause of your query's slowness.

In the earlier example, loading about 300,000 rows took less than a second to process, so that may not be a big concern with this database. However, that may not be the case with larger databases that can have up to a million rows in them.

One method of making these queries faster is by adding indexes to your table.

Indexing a Column

Think of indexes like bookmarks. Indexes point to specific rows, helping the query determine which rows match its conditions and retrieve those results quickly. With this process, the query avoids searching through the entire table, improving performance, particularly when you're using SELECT with WHERE clauses.

There are many types of indexes that you can add to your databases, with popular ones being regular indexes, primary indexes, unique indexes, full-text indexes and prefix indexes.

  • Regular Index: An index where values do not have to be unique and can be NULL.
  • Primary Index: Automatically created for primary keys. All column values are unique, and NULL values are not allowed.
  • Unique Index: An index where all column values are unique. Unlike a primary index, a unique index can contain NULL values.
  • Full-Text Index: An index used for searching through large amounts of text; it can only be created for CHAR, VARCHAR, and/or TEXT datatype columns.
  • Prefix Index: An index that uses only the first N characters of a text value, which can improve performance because only those characters need to be searched.

Now, you might be wondering: if indexes are so great, why don’t we add them to each column?

Generally, it’s best practice to avoid adding indexes to all your columns, only adding them to the ones that it may be helpful for, such as a column that is frequently accessed. While indexing can improve the performance of some queries, it can also slow down your inserts, updates and deletes because each index will need to be updated every time. Therefore, it’s important to find the balance between the number of indexes and the speed of your queries.

In addition, indexes are less helpful for querying small tables or large tables where almost all the rows need to be examined. In the case where most rows need to be examined, it would be faster to read all those rows rather than using an index. As such, adding an index is dependent on your needs.
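The before-and-after effect of adding an index can be seen directly in a query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN (an analogue of MySQL's EXPLAIN) via Python's built-in sqlite3 module; the employee table and column names echo the earlier example but are made up here.

```python
# Sketch: the same lookup before and after adding an index, using
# SQLite's EXPLAIN QUERY PLAN to show the change from a full table
# scan to an index search.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employee (employee_number INTEGER, hire_date TEXT)")
cur.executemany("INSERT INTO employee VALUES (?, ?)",
                [(i, "2020-01-01") for i in range(10000)])

query = "SELECT hire_date FROM employee WHERE employee_number = 1234"

# Without an index, the plan reports a scan of the whole table.
before = cur.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][-1]

cur.execute("CREATE INDEX idx_employee_number ON employee (employee_number)")

# With the index, the plan reports an index search instead.
after = cur.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][-1]

print(before)  # e.g. "SCAN employee"
print(after)   # e.g. "SEARCH employee USING INDEX idx_employee_number (...)"
```

The exact wording of the plan varies by SQLite version, but the switch from a scan to an index search is the point: the lookup no longer touches every row.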

Be Selective with Columns

When possible, avoid selecting all columns from your table. With larger datasets, selecting all columns and displaying them can take much longer than selecting the one or two columns that you need.

For example, with a dataset of about 300,000 employee entries, the following query takes about 0.31 seconds to load:

SELECT * FROM employee;

Sample SELECT ALL Output

But if we only wanted to see the employee numbers and their hire dates (2 out of the 6 columns) we could easily do so with this query that takes 0.12 seconds to load:

SELECT employee_number, hire_date FROM employee;

Sample SELECT Output with Employee Number and Hire Date

Notice how the execution time of the query is much faster compared to when we selected all the columns. This method can be helpful when dealing with large datasets from which you only need specific columns.
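The difference in the amount of data pulled back can be sketched with the sqlite3 module; the six-column employee table below mirrors the example above but is, of course, a made-up miniature.

```python
# Sketch: selecting only the columns you need instead of SELECT *.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE employee (
    employee_number INTEGER, name TEXT, department TEXT,
    title TEXT, salary REAL, hire_date TEXT)""")
cur.execute("INSERT INTO employee "
            "VALUES (1, 'Asha', 'IT', 'DBA', 90000, '2020-01-01')")

all_cols = cur.execute("SELECT * FROM employee").fetchone()
two_cols = cur.execute(
    "SELECT employee_number, hire_date FROM employee").fetchone()

print(len(all_cols), len(two_cols))  # 6 2
```

On one row the difference is negligible, but multiplied across hundreds of thousands of rows it becomes the 0.31-second versus 0.12-second gap seen above.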

Avoid Leading Wildcards

Leading wildcards, which are wildcards ("%abc") that find values that end with specific characters, result in full table scans, even with indexes in place.

If your query uses a leading wildcard and performs poorly, consider using a full-text index instead. This will improve the speed of your query while avoiding the need to search through every row.

Use the UNION ALL Clause

When using the OR operator with LIKE statements, a UNION ALL clause can improve the speed of your query, especially if the columns on both sides of the operator are indexed.

This improvement is due to the OR operator sometimes scanning the entire table and overlooking indexes, whereas the UNION ALL operator will apply them to the separate SELECT statements.
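The rewrite itself is mechanical: each side of the OR becomes its own SELECT. The sketch below verifies on a toy table (via sqlite3) that the two forms return the same rows; note that UNION ALL keeps duplicates, so the results match exactly only when the predicates don't overlap.

```python
# Sketch: rewriting an OR of two LIKE conditions as a UNION ALL of two
# SELECTs. Both forms return the same rows when the predicates don't
# overlap; each branch of the UNION ALL can use its own index.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employee (name TEXT, department TEXT)")
cur.executemany("INSERT INTO employee VALUES (?, ?)",
                [("Asha", "IT"), ("Ravi", "Sales"), ("Mei", "HR")])

with_or = cur.execute(
    "SELECT name FROM employee WHERE department LIKE 'IT%' "
    "OR department LIKE 'HR%' ORDER BY name").fetchall()

with_union = cur.execute(
    "SELECT name FROM employee WHERE department LIKE 'IT%' "
    "UNION ALL "
    "SELECT name FROM employee WHERE department LIKE 'HR%' "
    "ORDER BY name").fetchall()

print(with_or == with_union)  # True
```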

What are the different types of backups, and how do they differ?

There are many different types of backups that you can use to meet your backup and restore needs. Which types are available to you depends on the database system you are using. The commonly used types include:

  • Full backup: A full backup is a complete copy of all data in the object or objects being backed up. It simplifies the restore and recovery processes because you only need to locate the most recent full backup and restore that single file. However, as the size of your database grows, so do the time, bandwidth, and storage requirements for the backup file. Furthermore, if you decide to keep previous backup copies for safety, you will be storing many instances of a large file; if you only keep one copy, you must accept the risk that if the backup file becomes corrupt, you will be unable to restore your data. And if only a subset of your data changes on a regular basis, you could be redundantly backing up the same data. Storing a complete copy of your data outside of the RDBMS requires you to ensure that it is adequately secured and that unauthorized users cannot access it. When you restore a full backup, you return the data to the state it was in at the time the backup was taken; however, the database may have processed numerous transactions since then, which should ideally be restored as well.
  • Point-in-time backup: One solution is to enable logging for each transaction in your database, and then use the information in the log file to reapply the transactions to the restored database. The process of reapplying transactions after restoring a database backup is known as recovery, and it allows you to restore the data to the state it was in at a specific point in time, hence the name point-in-time recovery. For example, if you know that a DML statement executed at 11:05 am deleted some data inadvertently, you can restore the most recent full backup and then reapply the transactions up to that point in time, minimizing the loss of data changes that occurred between the last full backup and the moment the data was wrongly deleted.
  • Differential backup: A differential backup is a copy of any data that has changed since the previous full backup. The differential backup file is much smaller than a full backup file, reducing backup time, bandwidth, and storage requirements while still allowing you to restore a recent copy of the data. For example, you could run a full backup once a week on Sunday, followed by a differential backup every day of the week. Each differential backup contains all of the changes that have occurred since the full backup on Sunday, so if you need to restore the database on a Tuesday, you must first restore the full backup from Sunday and then restore the differential backup from Tuesday.
    You do not need to restore Monday’s differential backup, because all the changes in that file are also included in the differential backup from Tuesday.
  • Incremental backup: Incremental backups are similar to differential backups in that they only contain data that has changed since the last backup of any kind. For example, you could run a full backup once a week on Sunday and then an incremental backup every day of the week. Each incremental backup only contains changes since the previous day's backup, so if you need to restore the database on a Tuesday, you must first restore the full backup from Sunday, then the incremental backup from Monday, and finally the incremental backup from Tuesday. Restoring data from full and incremental backups will take longer than simply restoring a full backup or a full plus differential backup, but the time spent performing incremental backups is likely to be less than that needed for a differential backup.
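The restore sequences described above can be sketched with a toy model in which each "backup" is just a dict of changed rows. This only illustrates which files must be restored and in what order, not how a real RDBMS stores them.

```python
# Toy model of full, differential, and incremental restores. Backups are
# plain dicts of changed rows; this only illustrates which files must be
# restored, not how a real RDBMS stores them.
full = {"a": 1, "b": 2, "c": 3}          # Sunday: full backup

# Changes made on Monday and Tuesday
monday_changes = {"b": 20}
tuesday_changes = {"d": 4}

# A differential backup always contains everything since the full
# backup, so Tuesday's differential includes Monday's changes too...
diff_tuesday = {**monday_changes, **tuesday_changes}
# ...and a restore needs only the full backup plus the latest differential.
restored_diff = {**full, **diff_tuesday}

# Incremental backups contain only changes since the previous backup,
# so every increment must be replayed in order.
restored_incr = {**full, **monday_changes, **tuesday_changes}

print(restored_diff == restored_incr)  # True
```

Both strategies arrive at the same final state; they differ in how much you store each day versus how many files you must replay at restore time.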

Note: Different database systems use different terminology for the log containing the transaction information, for example, MySQL calls it the binary log, Postgres calls it the write-ahead log, and Db2 on Cloud calls it the transaction log. Database transaction logs keep track of all activities that change the database structure and record transactions that insert, update, or delete data in the database.

Difference between hot backups and cold backups:

Hot Backups vs. Cold Backups:

  • Hot backups, or online backups, are performed on data while it is in use. Cold backups, or offline backups, are performed on data while it is not in use.
  • Hot backups have no impact on availability; users can continue with their activities throughout the backup period. Cold backups impact availability; users are unable to access the data, which is a problem in 24/7 environments.
  • Hot backups can result in performance degradation for users while the backup is running and can impact data integrity if the data changes during the backup process. With cold backups, the data integrity risk is eliminated because the database is taken offline.
  • Hot backups are stored on the available servers. Cold backups are stored on external drives or on a server that is shut down between backup operations.
  • Hot backups offer lesser data safety; cold backups offer greater data safety.

Summary:

  • Full backups are simple to create and restore but can be slow to run and result in large files. 
  • Point-in-time recovery provides a more granular recovery model than just using full database backups. 
  • Differential backups are quicker to run than full backups, but the restore process can take longer.
  • Incremental backups are even quicker to run, but the restore process can take even longer.
  • Hot, or online, backups allow data to be backed up while the database is active, whereas cold backups require the database to be offline.
  • The backup policy should be determined from recovery and availability needs and data usage patterns. Most managed cloud databases provide automated backup functionality with some configurable options.


Friday, March 24, 2023

Introduction to Backup and Restore techniques performed by DBA's

Backup and restore is a common phrase in database conversations, often used to refer to the process of backing up data for protection purposes and restoring it after data loss from an unplanned shutdown, accidental deletion, or data corruption. However, there are other scenarios when you might want to back up or restore your databases or objects within them.

After reading this post, you will be able to: 

  • Describe common backup and restore scenarios, 
  • Explain the difference between physical and logical backup,
  • List the objects that you can back up and restore, and 
  • Explain what you need to consider when backing up databases and their objects. 

 As a data engineer/Data Ninja, you are likely to perform data backup and restore operations to transfer data from one database to another. This may be to facilitate a change of RDBMS, to share data with or load data from a business partner, or to create a copy of the data for use in another location or system, such as development or test. When backing up databases, you can perform either logical or physical backups. 

Difference between Logical backups and physical backups:

  • Content: A logical backup creates a file containing DDL commands (such as CREATE TABLE) and DML commands (such as INSERT) that recreate the objects and data in the database; you can use this file to recreate the database on the same or another system. A physical backup creates a copy of all the physical storage files and directories that belong to a table, database, or other object, including the data files, configuration files, and log files, to aid point-in-time recovery.
  • Space: When you perform a logical backup and restore, you reclaim any wasted space from the original database because the restore process creates a clean version of the tables. A physical backup and restore cannot reclaim the wasted space.
  • Performance: Generating logical backups can take a long time for large databases and may impact the performance of other queries that are running concurrently. Physical backups are often smaller and quicker than logical backups; they are useful for large or important databases that require fast recovery times, and they are similar to backing up any other type of file on your system.
  • Granularity: Logical backups enable you to back up granular objects; for example, you can back up an individual database or table, although you cannot use them to back up log files or database configuration settings. You typically use import, export, dump, and load utilities to perform logical backups. Physical backups are less granular; for example, you cannot back up an individual database or table.
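Both styles can be sketched with Python's built-in sqlite3 module: `iterdump()` emits SQL text that recreates the database (a logical backup), while `Connection.backup()` copies the database pages to another connection or file (closer in spirit to a physical backup).

```python
# Sketch: logical vs. physical-style backup with sqlite3.
import sqlite3

src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t (x INTEGER)")
src.execute("INSERT INTO t VALUES (1)")
src.commit()

# Logical backup: DDL + DML statements as text, usable to recreate
# the database elsewhere.
dump = "\n".join(src.iterdump())
print("CREATE TABLE t" in dump)  # True

# Physical-style backup: page-level copy into another connection.
dst = sqlite3.connect(":memory:")
src.backup(dst)
print(dst.execute("SELECT x FROM t").fetchall())  # [(1,)]
```

In a client-server RDBMS the equivalents would be utilities like mysqldump or pg_dump for the logical side and file-level or snapshot copies for the physical side.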


You can choose exactly which parts of a database you want to back up. Depending on the RDBMS you are using and the type of backup you are performing, you can back up a whole database, the contents of a schema, one or more tables from a database, a subset of data from one or more tables in a database, or a collection of other objects in the database.

It is essential to verify your backups and test your restore process when using backup and restore as part of your disaster recovery plans, as an invalid backup or an inability to restore can result in data loss. You should also ensure that you secure the transfer and storage location of your backup files to the same level that you secure the data in your database.

When performing backups, some RDBMSs support additional options that you can use: You may be able to configure a compression level for the backup files. Compressing the files will reduce the output file size which can be useful for large databases or if you are backing up to a remote location; however, it comes at a cost of time taken to perform the backup and the restore procedures. You may also be able to encrypt the backup files, to reduce the risk of any data being compromised. But again, this will increase the time taken for the backup and restore.

Summary:

In this post, you understood that:

  • You can use backup and restore for data recovery and other purposes.
  • Physical backups create a copy of the raw database storage files and directories, whereas logical backups extract the data from a database and save it in a special format.
  • You can back up whole databases or objects within them.
  • You should always check that your backup is safe and usable and that your restore plan works.
  • You can use backup options to compress or encrypt your files.
