Wednesday, October 31, 2012
12c Grid Control New Features
1. Database Creation Via Enterprise Manager Cloud Control
2. Database Upgrade Via Enterprise Manager Cloud Control
3. Database Cloning Enhancements
4. Oracle Exadata Server Management Enhancements
5. Manage Automatic Storage Management (ASM) Clusters as a Target
6. Database Configuration Compliance Standards Support
7. Emergency Performance
8. Database Backup and Restore Enhancements
9. Database System Discovery Enhancements
10. Change Plans Added to Change Management Pack
11. Compare Period Advisor
12. Compare Functionality
13. Active Reports
14. Real Application Testing and Data Masking Integration
15. Application Templates for Data Masking and Data Subsetting
16. Data Subsetting
17. Application Data Model Support for Data Masking
18. Reversible Data Masking
19. Performance Diagnostics Enhancements
20. Streams and XStreams Support
-----------------------------------------------------------------------------------------------------------
1. Database Creation Via Enterprise Manager Cloud Control
A wizard allows you to create an Oracle Database from within Enterprise Manager Cloud Control. You can create different configurations, including Single Instance and Real Application Clusters (RAC) databases, as well as file system and Automatic Storage Management (ASM) storage options.
2. Database Upgrade Via Enterprise Manager Cloud Control
You can now upgrade Single Instance and Real Application Clusters (RAC) Oracle databases through Cloud Control. This feature makes it possible to manage upgrades from a single console. You no longer have to access each individual database machine to perform upgrades.
3. Database Cloning Enhancements
Cloning procedures have been enhanced to capture configuration properties as well as the software payload. This is particularly useful when migrating databases from development to test to production or the reverse. A new EMCLI verb (clone_database) allows for database cloning using the same job type as the existing Clone Database feature of Cloud Control.
4. Oracle Exadata Server Management Enhancements
Oracle Exadata management capabilities now provide support for full target lifecycle management:
- Automatically discover Exadata targets
- Automatically create an Enterprise Manager System for end-to-end monitoring
- Provide extensive administration capabilities for databases, Exadata cells and Infiniband switches
- Simplify performance diagnostics with the help of in-depth performance charts covering all Exadata components
5. Manage Automatic Storage Management (ASM) Clusters as a Target
You can now manage clustered ASM resources as a single target, instead of each individual ASM instance having to be managed separately. Incident management and metric monitoring can be managed for the entire cluster.
6. Database Configuration Compliance Standards Support
Oracle database configuration data can now be managed within the new configuration and compliance standards frameworks.
7. Emergency Performance
This feature allows a DBA to diagnose and troubleshoot a hung or extremely slow database using the memory access mode. This mode is designed to bypass the SQL information retrieval layer and read performance statistics directly from the SGA of the target.
8. Database Backup and Restore Enhancements
You can now use Enterprise Manager Cloud Control to centrally maintain the settings for Oracle database and file system backups. This enhancement enables you to apply uniform settings to multiple databases and file systems when backing up multiple objects—all in one operation. Users can perform a backup on multiple databases concurrently or sequentially in one deployment procedure. An Oracle Home OSB tape backup can be restored either to the original or to a different location, and the restored Oracle Home can be reconfigured to function in the newly restored location.
9. Database System Discovery Enhancements
As the database system is now built upon the new target and association model, you can use it to monitor and manage a database’s storage, connectivity, and high availability. This also enables you to easily monitor and manage the applications that are dependent on the database. The database discovery functionality is enhanced to work with the new discovery framework and to provide a reliable workflow to create a database system.
10. Change Plans Added to Change Management Pack
As part of the Oracle Change Management Pack, the new Change Plans function allows application developers and database administrators to encapsulate schema changes needed to be made to a database into a “change plan,” which can be used to document, capture, and apply schema changes. Change Plans are also integrated with developer and DBA tasks into SQL*Developer and Oracle Enterprise Manager task automation. This integration reduces the manual processes between the various stakeholders involved in the process of promoting planned changes across enterprise databases while ensuring the integrity of the process.
11. Compare Period Advisor
This feature compares the performance of a database over two different time ranges. It analyzes changes in performance, workload, configuration, and hardware to highlight changes between the two time periods. The Compare Period Advisor gives the DBA the ability to compare two arbitrary periods of time.
12. Compare Functionality
The Compare functionality has been enhanced with new capabilities such as template support, system level comparison, and change notification. Users can now selectively include or ignore types of differences. Output of a comparison can easily be saved and exported, both in printable (for example, plain text) and data-centric (for example, CSV) formats. Users can select comparison start and end dates and view a history of changes for composite targets.
13. Active Reports
A new Active Reports function allows users to save performance data into an HTML file. Once saved, the report can be used for offline analysis or sent to other users, including Oracle Support. Active Reports enhances the visual representation of performance data and facilitates the convenient exchange of complex data.
14. Real Application Testing and Data Masking Integration
Real Application Testing and Data Masking integration provides users with the ability to perform secure testing in situations where data in production needs to be shared by nonproduction users due to organization or business requirements. Typically testing is done in a nonproduction environment or by a different group or organization. This integration addresses a common requirement that the data used for testing be shared in a manner that adheres to data privacy and compliance regulations.
15. Application Templates for Data Masking and Data Subsetting
This feature provides predefined data masking and data subsetting templates for applications. It allows users to automatically create test systems based on best practices recommendations.
16. Data Subsetting
Data subsetting provides the ability to create a smaller sized copy of the original production data that can be given to developers for testing. While it is a data subset, the referential relationships are preserved so that the data set is complete. This allows enterprises to lower storage costs while making production data available to developers for testing, without having to incur the storage footprint of the entire production database.
17. Application Data Model Support for Data Masking
The application data model (ADM) now stores the sensitive data elements used to generate mask definitions dynamically. Instead of having to manually discover sensitive data, the application data model identifies and stores the sensitive data elements.
18. Reversible Data Masking
Using encryption and decryption algorithms, reversible masking allows encryption of a user’s data deterministically into a format chosen by the user as a regular expression. Unmasking reverses the process to revert back to the original data. This feature is useful in environments where sensitive data needs to be masked and sent to a third party for processing. Coupling integrated masking with the application data model (ADM), an application’s data model is now available for certain packaged applications and can serve as a knowledge base containing sensitive column and data relationships.
19. Performance Diagnostics Enhancements
With the interactive user interface in the Active Session History (ASH) Viewer, users now can visualize the many performance dimensions that were not available to them in earlier releases. The Enhanced Enterprise Manager Performance and Top Activity pages allow users to visualize the multidimensional data in ASH. The ASH viewer enhances the performance troubleshooting capabilities of a DBA by providing the facility to detect skews in workload. Emergency ADDM adds performance diagnostics for databases suffering from severe performance problems.
20. Streams and XStreams Support
Streams and XStreams configurations can now be managed and monitored using Cloud Control. In addition to improvements in configuration and performance monitoring screens, logical change record (LCR) tracking is available for high-level diagnosis of replication issues. Cloud Control also simplifies the management and monitoring of replicated environments.
Tuesday, October 30, 2012
Shut down Oracle Database
Which shutdown option is best for a given situation in Oracle
When the NORMAL or IMMEDIATE mode doesn't work, as a last resort we use 'SHUTDOWN ABORT' to terminate an active instance as quickly as possible, leaving the database in an inconsistent state. The subsequent database startup then requires instance recovery, which the SMON background process performs automatically. Having said that, this mode can sometimes carry a significant risk of data corruption, specifically in versions before 8.1.6.
Beyond a doubt, 'SHUTDOWN ABORT' is the fastest mode of database shutdown. Nevertheless, we are sometimes afraid of using it because of the facts mentioned above. To decide between a clean shutdown and a shutdown abort, one can do the following exercise:
Determine the amount of rollback required (in bytes) for a clean database shutdown:
select sum(used_ublk) * <block size of the undo tablespace>
from v$transaction;
If the amount of rollback required for a clean shutdown is small, go ahead with 'SHUTDOWN IMMEDIATE'. If it is large and would take a long time, use the 'SHUTDOWN ABORT' command, preferably only if you are on a version later than 8.1.6.
Afterwards, bring the database up in RESTRICT mode and check the rollback progress:
select sum(distinct(ktuxesiz))
from x$ktuxe where ktuxecfl = 'DEAD';
Once the rollback has completed, shut the database down cleanly using 'SHUTDOWN IMMEDIATE'.
Reference:
What Is The Fastest Way To Cleanly Shutdown An Oracle Database? [ID 386408.1]
Sunday, September 30, 2012
if condition options in shell scripting
| Primary | Meaning |
|---|---|
| [ -a FILE ] | True if FILE exists. |
| [ -b FILE ] | True if FILE exists and is a block-special file. |
| [ -c FILE ] | True if FILE exists and is a character-special file. |
| [ -d FILE ] | True if FILE exists and is a directory. |
| [ -e FILE ] | True if FILE exists. |
| [ -f FILE ] | True if FILE exists and is a regular file. |
| [ -g FILE ] | True if FILE exists and its SGID bit is set. |
| [ -h FILE ] | True if FILE exists and is a symbolic link. |
| [ -k FILE ] | True if FILE exists and its sticky bit is set. |
| [ -p FILE ] | True if FILE exists and is a named pipe (FIFO). |
| [ -r FILE ] | True if FILE exists and is readable. |
| [ -s FILE ] | True if FILE exists and has a size greater than zero. |
| [ -t FD ] | True if file descriptor FD is open and refers to a terminal. |
| [ -u FILE ] | True if FILE exists and its SUID (set user ID) bit is set. |
| [ -w FILE ] | True if FILE exists and is writable. |
| [ -x FILE ] | True if FILE exists and is executable. |
| [ -O FILE ] | True if FILE exists and is owned by the effective user ID. |
| [ -G FILE ] | True if FILE exists and is owned by the effective group ID. |
| [ -L FILE ] | True if FILE exists and is a symbolic link. |
| [ -N FILE ] | True if FILE exists and has been modified since it was last read. |
| [ -S FILE ] | True if FILE exists and is a socket. |
| [ FILE1 -nt FILE2 ] | True if FILE1 has been changed more recently than FILE2, or if FILE1 exists and FILE2 does not. |
| [ FILE1 -ot FILE2 ] | True if FILE1 is older than FILE2, or if FILE2 exists and FILE1 does not. |
| [ FILE1 -ef FILE2 ] | True if FILE1 and FILE2 refer to the same device and inode numbers. |
| [ -o OPTIONNAME ] | True if shell option "OPTIONNAME" is enabled. |
| [ -z STRING ] | True if the length of "STRING" is zero. |
| [ -n STRING ] or [ STRING ] | True if the length of "STRING" is non-zero. |
| [ STRING1 == STRING2 ] | True if the strings are equal. "=" may be used instead of "==" for strict POSIX compliance. |
| [ STRING1 != STRING2 ] | True if the strings are not equal. |
| [ STRING1 < STRING2 ] | True if "STRING1" sorts before "STRING2" lexicographically in the current locale. |
| [ STRING1 > STRING2 ] | True if "STRING1" sorts after "STRING2" lexicographically in the current locale. |
| [ ARG1 OP ARG2 ] | "OP" is one of -eq, -ne, -lt, -le, -gt or -ge. These arithmetic binary operators return true if "ARG1" is equal to, not equal to, less than, less than or equal to, greater than, or greater than or equal to "ARG2", respectively. "ARG1" and "ARG2" are integers. |
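To see a few of these primaries working together, here is a small sketch. The function names describe_path and is_at_least are made up for this example and are not standard commands; the sketch only uses the -d, -f, -s, -e and -ge tests from the table.

```shell
#!/bin/sh
# describe_path is a hypothetical helper: it classifies a path using
# the file primaries -d, -f, -s and -e from the table above.
describe_path() {
    if [ -d "$1" ]; then
        echo "directory"
    elif [ -f "$1" ] && [ -s "$1" ]; then
        echo "non-empty file"
    elif [ -f "$1" ]; then
        echo "empty file"
    elif [ ! -e "$1" ]; then
        echo "missing"
    else
        echo "other"
    fi
}

# is_at_least is also hypothetical: an integer comparison using -ge.
# It succeeds (exit status 0) when the first number is >= the second.
is_at_least() {
    [ "$1" -ge "$2" ]
}
```

Always quote the variables inside [ ]; an unquoted empty variable changes the number of arguments the test sees and usually produces a confusing error.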
Thursday, September 27, 2012
How to Copy virtual machines into ESXi using the vSphere Client
How to copy a vmdk to ESXi
- In the vSphere Client, click on the server.
- Click on the Summary tab.
- Right-click on the datastore -> Browse Datastore.
- Click the upload icon (the up arrow in front of a stack of disks, "Upload files to this datastore").
- Upload the folder or files that you require.
- Create a new VM (or use an existing one) and attach the vmdk file you have uploaded.
Saturday, August 18, 2012
SSH Automatic Login
From the source host:
1) ssh-keygen -t dsa
2) cat /oracle1/oraprod/.ssh/id_dsa.pub | ssh -l oraprod 192.168.1.9 'cat >> /oracle1/oraprod/.ssh/authorized_keys'
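The 'cat >>' above appends the key every time it is run, and sshd will normally ignore an authorized_keys file whose permissions are too open. As a rough sketch of a safer append, something like the following could be run on the target host. The function name add_authorized_key and its two arguments are my own invention for illustration, not part of OpenSSH.

```shell
#!/bin/sh
# add_authorized_key is a hypothetical helper meant to run on the target host.
# $1 = file holding one public key, $2 = the user's .ssh directory.
add_authorized_key() {
    pubkey_file=$1
    ssh_dir=$2
    auth="$ssh_dir/authorized_keys"

    # The .ssh directory and the key file must not be group/world writable.
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1
    touch "$auth" || return 1
    chmod 600 "$auth" || return 1

    # -x matches whole lines, -F treats the key as a fixed string:
    # only append when the exact key line is not already present.
    if ! grep -qxF -- "$(cat "$pubkey_file")" "$auth"; then
        cat "$pubkey_file" >> "$auth"
    fi
}
```

Running it twice with the same key leaves a single entry in authorized_keys.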
Thursday, July 5, 2012
Increasing Swap Space on Linux
Check the memory on your server
[root@host root] # free -m
Now say you need to increase swap by 500 MB on your server. First locate a filesystem where you can spare this 500 MB; in my case I found it in /stage.
Use the dd command to create a swap file:
#cd /u01
# dd if=/dev/zero of=swapfile bs=1024 count=512000
512000+0 records in
512000+0 records out
# ls -ltr
drwx------ 2 root root 16384 May 1 2006 lost+found
-rw-r--r-- 1 root root 524288000 Nov 28 13:58 swapfile
Next issue the following two commands
# mkswap swapfile
Setting up swapspace version 1, size = 524283 kB
# swapon swapfile
Now check your memory again
# free -m
Bingo! Here is your increased swap.
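The count=512000 used with dd is just the target size divided by the block size: 500 MB is 500 * 1024 * 1024 = 524288000 bytes, and with bs=1024 that is 512000 blocks, which matches the file size in the ls output. A tiny sketch of that arithmetic, where swap_dd_count is a name I made up for this example:

```shell
#!/bin/sh
# swap_dd_count is a hypothetical helper: it prints the dd "count" needed
# for a swap file of SIZE_MB megabytes with a block size of BS bytes.
# $1 = size in MB, $2 = dd block size in bytes (defaults to 1024).
swap_dd_count() {
    size_mb=$1
    bs=${2:-1024}
    echo $(( size_mb * 1024 * 1024 / bs ))
}

# For the 500 MB example above:
swap_dd_count 500
```

With a larger block size such as bs=1048576 the same helper gives count=500, which usually makes dd noticeably faster for big files.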
Tuesday, June 19, 2012
select any table privilege
• If O7_DICTIONARY_ACCESSIBILITY=TRUE, the SELECT ANY TABLE privilege provides access to all SYS and non-SYS objects.
• If O7_DICTIONARY_ACCESSIBILITY=FALSE, the SELECT ANY TABLE privilege provides access only to non-SYS objects.
• If only SELECT_CATALOG_ROLE is enabled, it provides access to all SYS views only.
• If only the SELECT ANY DICTIONARY privilege is enabled, it provides access to SYS schema objects only.
• If both the SELECT ANY TABLE and SELECT ANY DICTIONARY privileges are enabled, they allow access to all SYS and non-SYS objects.
• The SELECT ANY DICTIONARY privilege and SELECT_CATALOG_ROLE work independently of the O7_DICTIONARY_ACCESSIBILITY setting.
How to select tables with special characters (like _) in their names in Oracle
Example SQL query :
select table_name from dba_tables where owner='ownername' AND table_name like '%\_%' escape '\'
One more example
where ksppinm like '%\_io\_%' ESCAPE '\';
Oracle Escape Characters
Oracle allows you to assign an escape character that tells Oracle the character following it is to be interpreted literally. Certain characters are not interpreted literally in a LIKE pattern because they have special meaning within Oracle: the underscore "_" matches any single character and the percent sign "%" matches any sequence of characters.
So we have to use the ESCAPE clause when selecting tables whose names contain special characters like _, as in the examples above.
Tuesday, June 12, 2012
apply cpu patch
cpu patch location on the Server
/opt/oracle/July2010/9655017
select * from registry$history;
select * from v$version;
select * from dba_registry_history;
select count(1) from dba_objects where status like 'I%';
SELECT OBJECT_NAME,OBJECT_TYPE,owner FROM DBA_OBJECTS WHERE STATUS= 'INVALID';
cd $ORACLE_HOME/OPatch
opatch version
opatch lsinventory
4) Use the commands below to back up the Oracle home and the inventory.
cd /oracle10g/PRDRCD1/product/10.2
tar cvf - .|gzip -c > /oradb/PRDRCD1/oradata1/back_home/home_prdrcd1_`hostname`_`date +%Y%m%d`.tar.gz
cd /oracle10g/PRDRCD1/product/10.2/inventory
tar cvf - .|gzip -c > /oradb/PRDRCD1/oradata1/back_home/oraInvent_prdrcd1_`hostname`_`date +%Y%m%d`.tar.gz
cd /oracle10g/oraInventory
tar cvf - .|gzip -c > /oradb/PRDRCD1/oradata1/back_home/oracle10g_oraInventory_`hostname`_`date +%Y%m%d`.tar.gz
5) Take a backup of the OPatch directory.
cd $ORACLE_HOME
cp -r OPatch OPatch_bak
6) unzip the p6880880_102000_SOLARIS64.zip under ORACLE_HOME
export PATH=$PATH:/usr/ccs/bin
export PATH=$ORACLE_HOME/OPatch:$PATH:.
opatch version
opatch napply -skip_subset -skip_duplicate
8) Run catbundle.sql
cd $ORACLE_HOME/rdbms/admin
sqlplus /'as sysdba'
startup
@catbundle.sql cpu apply
cd $ORACLE_HOME/cpu/view_recompile
sqlplus /'as sysdba'
@recompile_precheck_jan2008cpu.sql
sql>shut immediate
startup upgrade
@view_recompile_jan2008cpu.sql
shut immediate
startup
@utlrp.sql
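The three backups in step 4 all follow the same pattern: cd into a directory, tar it to stdout, gzip it into a file named with the hostname and date. A minimal sketch of that pattern as a reusable function follows. The name backup_dir and its arguments are hypothetical, and I use uname -n in place of hostname only because it is available on more systems.

```shell
#!/bin/sh
# backup_dir is a hypothetical helper. It archives directory SRC into
# DEST/LABEL_<nodename>_<YYYYMMDD>.tar.gz and prints the archive path.
# $1 = SRC directory, $2 = DEST directory (absolute path, outside SRC),
# $3 = LABEL used as the file name prefix.
backup_dir() {
    src=$1
    dest=$2
    label=$3
    archive="$dest/${label}_$(uname -n)_$(date +%Y%m%d).tar.gz"

    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$src" && tar cf - . | gzip -c > "$archive" ) || return 1

    # List the archive once to confirm it is readable before relying on it.
    gzip -dc "$archive" | tar tf - > /dev/null || return 1

    echo "$archive"
}
```

For example, backup_dir /oracle10g/oraInventory /oradb/PRDRCD1/oradata1/back_home oracle10g_oraInventory would produce a file named like the third backup in step 4.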
Friday, August 26, 2011
RAC Architecture
Oracle Real Application Clusters (RAC) allows multiple instances, running on multiple nodes, to access a single database. In a standard Oracle configuration a database can only be mounted by one instance, but in a RAC environment many instances can access a single database.

Oracle's RAC is heavily dependent on an efficient, highly reliable, high-speed private network called the interconnect. When designing a RAC system, make sure that you get the best that you can afford.
The table below describes the differences between a standard Oracle database (single instance) and a RAC environment
| Component | Single Instance Environment | RAC Environment |
|---|---|---|
| SGA | Instance has its own SGA | Each instance has its own SGA |
| Background processes | Instance has its own set of background processes | Each instance has its own set of background processes |
| Datafiles | Accessed by only one instance | Shared by all instances (shared storage) |
| Control Files | Accessed by only one instance | Shared by all instances (shared storage) |
| Online Redo Logfile | Dedicated for write/read to only one instance | Only one instance can write but other instances can read during recovery and archiving. If an instance is shutdown, log switches by other instances can force the idle instance redo logs to be archived |
| Archived Redo Logfile | Dedicated to the instance | Private to the instance but other instances will need access to all required archive logs during media recovery |
| Flash Recovery Log | Accessed by only one instance | Shared by all instances (shared storage) |
| Alert Log and Trace Files | Dedicated to the instance | Private to each instance, other instances never read or write to those files. |
| ORACLE_HOME | Multiple instances on the same server accessing different databases can use the same executable files | Same as single instance, plus it can be placed on a shared file system, allowing a common ORACLE_HOME for all instances in a RAC environment. |
The major components of an Oracle RAC system are
- Shared disk system
- Oracle Clusterware
- Cluster Interconnects
- Oracle Kernel Components
The below diagram describes the basic architecture of the Oracle RAC environment

Here is a list of the processes running on a freshly installed RAC

With today's SAN and NAS disk storage systems, sharing storage is fairly easy, and shared storage is required for a RAC environment. You can use the storage setups below
- SAN (Storage Area Network) - generally using fibre to connect to the SAN
- NAS (Network Attached Storage) - generally using a network to connect to the NAS using either NFS or iSCSI
- JBOD - direct attached storage, the old traditional way and still used by many companies as a cheap option
All of the above solutions can offer multi-pathing to reduce SPOFs within the RAC environment. There is no reason not to configure multi-pathing, as adding additional paths to the disk is cheap: most of the expense is paid when configuring the first path, so an additional controller card and network/fibre cables are all that is needed.
The last thing to think about is how to set up the underlying disk structure, known as the RAID level. There are about 12 different RAID levels that I know of; here are the most common ones
| RAID level | Description |
|---|---|
| RAID 0 (Striping) | A number of disks are concatenated together to give the appearance of one very large disk. |
| RAID 1 (Mirroring) | A single disk is mirrored by another disk; if one disk fails the system is unaffected as it can use its mirror. |
| RAID 5 | RAID stands for Redundant Array of Inexpensive Disks. The disks are striped with parity across 3 or more disks; the parity is used in the event that one of the disks fails, and the data on the failed disk is reconstructed by using the parity bit. |
There are many other RAID levels that can be used with a particular hardware environment; for example, EMC storage uses RAID-S and HP storage uses Auto RAID, so check with the manufacturer for the solution that will give you the best performance and resilience.
Once you have your storage attached to the servers, you have three choices on how to set up the disks
- Raw Volumes - normally used for performance benefits, however they are hard to manage and back up
- Cluster FileSystem - used to hold all the Oracle datafiles, can be used by Windows and Linux, it's not widely used
- Automatic Storage Management (ASM) - Oracle's choice of storage management, it's a portable, dedicated and optimized cluster filesystem
I will only be discussing ASM, which I already have a topic on called Automatic Storage Management.
Oracle Clusterware software is designed to run Oracle in a cluster mode. It can support up to 64 nodes, and it can even be used with a vendor cluster like Sun Cluster.
The Clusterware software allows nodes to communicate with each other and forms the cluster that makes the nodes work as a single logical server. The software is run by the Cluster Ready Services (CRS) using the Oracle Cluster Registry (OCR) that records and maintains the cluster and node membership information and the voting disk which acts as a tiebreaker during communication failures. Consistent heartbeat information travels across the interconnect to the voting disk when the cluster is running.
The CRS has four components
- OPROCd - Process Monitor Daemon
- CRSd - CRS daemon, which manages the cluster resources; if it fails it is restarted automatically without a node reboot
- OCSSd - Oracle Cluster Synchronization Service Daemon (updates the registry); the failure of this daemon results in the node being rebooted to avoid data corruption
- EVMd - Event Manager Daemon
The OPROCd daemon provides the I/O fencing for the Oracle cluster; it uses the hangcheck timer or watchdog timer for cluster integrity. It is locked into memory and runs as a realtime process, and failure of this daemon results in the node being rebooted. Fencing is used to protect the data: if a node were to have problems, fencing presumes the worst and protects the data by restarting the node in question. It's better to be safe than sorry.
The CRSd process manages resources, such as starting and stopping the services and failover of the application resources; it also spawns separate processes to manage application resources. CRS manages the OCR and stores the current known state of the cluster, and it requires a public, private and VIP interface in order to run. OCSSd provides synchronization services among nodes; it provides access to the node membership and enables basic cluster services, including cluster group services and locking. Failure of this daemon causes the node to be rebooted to avoid split-brain situations.
The below functions are covered by the OCSSd
- CSS provides basic Group Services Support, it is a distributed group membership system that allows applications to coordinate activities to achieve a common result.
- Group services use vendor clusterware group services when it is available.
- Lock services provide the basic cluster-wide serialization locking functions, it uses the First In, First Out (FIFO) mechanism to manage locking
- Node services uses OCR to store data and updates the information during reconfiguration, it also manages the OCR data which is static otherwise.
The last component is the Event Management Logger, which runs the EVMd process. The daemon spawns a process called evmlogger and generates events when things happen. The evmlogger spawns new child processes on demand and scans the callout directory to invoke callouts. The death of the EVMd daemon will not halt the instance; the daemon is restarted automatically.
Quick recap
| CRS Process | Functionality | Failure of the Process | Run As |
|---|---|---|---|
| OPROCd - Process Monitor | provides basic cluster integrity services | Node Restart | root |
| EVMd - Event Management | spawns a child process event logger and generates callouts | Daemon automatically restarted, no node restart | oracle |
| OCSSd - Cluster Synchronization Services | basic node membership, group services, basic locking | Node Restart | oracle |
| CRSd - Cluster Ready Services | resource monitoring, failover and node recovery | Daemon restarted automatically, no node restart | root |
Cluster Ready Services (CRS) is a new component in 10g RAC; it is installed in a separate home directory called ORACLE_CRS_HOME. It is a mandatory component but can be used with a third-party cluster (Veritas, Sun Cluster). By default it manages the node membership functionality along with managing regular RAC-related resources and services.
RAC uses a membership scheme, so any node wanting to join the cluster has to become a member. RAC can evict any member that it deems a problem; its primary concern is protecting the data. You can add and remove nodes from the cluster, and the membership increases or decreases accordingly. When network problems occur, membership becomes the deciding factor on which part stays as the cluster and which nodes get evicted; a voting disk is used for this, which I will talk about later.
The resource management framework manages the resources of the cluster (disks, volumes), so you can have only one resource management framework per resource. Multiple frameworks are not supported as they can lead to undesirable effects.
Oracle Cluster Ready Services (CRS) uses the registry to keep the cluster configuration; it should reside on shared storage and be accessible to all nodes within the cluster. This shared storage is known as the Oracle Cluster Registry (OCR) and it's a major part of the cluster. It is automatically backed up (every 4 hours) by the daemons, and you can also back it up manually. The OCSSd uses the OCR extensively and writes the changes to the registry.
The OCR keeps details of all resources and services; it stores name and value pairs of information, such as the resources that are used to manage the resource equivalents by the CRS stack. Resources within the CRS stack are components that are managed by CRS and have the information on the good/bad state and the callout scripts. The OCR is also used to supply bootstrap information (ports, nodes, etc.); it is a binary file.
The OCR is loaded as a cache on each node. Each node updates its cache, then only one node, called the master, is allowed to write the cache to the OCR file. Enterprise Manager also uses the OCR cache. The OCR should be at least 100MB in size. The CRS daemon updates the OCR with the status of the nodes in the cluster during reconfigurations and failures.
The voting disk (or quorum disk) is shared by all nodes within the cluster. Information about the cluster is constantly being written to the disk; this is known as the heartbeat. If for any reason a node cannot access the voting disk it is immediately evicted from the cluster. This protects the cluster from split-brains (the Instance Membership Recovery (IMR) algorithm is used to detect and resolve split-brains), as the voting disk decides which part is really the cluster. The voting disk manages the cluster membership and arbitrates the cluster ownership during communication failures between nodes. Voting is often confused with quorum; they are similar but distinct. The table below details what each means
| Term | Meaning |
|---|---|
| Voting | A vote is usually a formal expression of opinion or will in response to a proposed decision. |
| Quorum | The number, usually a majority of the members of a body, that when assembled is legally competent to transact business. |
The only vote that counts is the quorum member vote; the quorum member vote defines the cluster. If a node or group of nodes cannot achieve a quorum, they should not start any services because they risk conflicting with an established quorum.
The voting disk has to reside on shared storage; it is a small file (20MB) that can be accessed by all nodes in the cluster. In Oracle 10g R1 you can have only one voting disk, but in R2 you can have up to 32 voting disks, allowing you to eliminate any SPOFs.
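With several voting disks, my understanding from Oracle's Clusterware documentation is that a node must be able to access more than half of them to stay in the cluster, which is why an odd number of voting disks is normally recommended. The sketch below only illustrates that majority arithmetic; the function name has_quorum is made up, and this is not how CSS itself is implemented.

```shell
#!/bin/sh
# has_quorum is a hypothetical helper. It succeeds (exit status 0) when a
# node can access a strict majority of the configured voting disks.
# $1 = voting disks the node can access, $2 = voting disks configured.
has_quorum() {
    accessible=$1
    configured=$2
    # Strict majority: more than half, so 2 of 3 passes but 2 of 4 does not.
    [ "$accessible" -gt $(( configured / 2 )) ]
}

# Example: with 3 voting disks, losing one disk still leaves a majority.
if has_quorum 2 3; then
    echo "node keeps its cluster membership"
fi
```

Note that going from 3 to 4 voting disks does not let you lose any more disks: both configurations tolerate the loss of just one.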
The original Virtual IP in Oracle was Transparent Application Failover (TAF); this had limitations and has now been replaced with cluster VIPs. The cluster VIPs will fail over to working nodes if a node should fail. These public IPs are configured in DNS so that users can access them. The cluster VIPs are different from the cluster interconnect IP addresses and are only used to access the database.
The cluster interconnect is used to synchronize the resources of the RAC cluster, and is also used to transfer some data from one instance to another. This interconnect should be private, highly available and fast with low latency, ideally on at least a private 1Gb (gigabit) network. Whatever hardware you are using, the NIC should use multi-pathing (Linux - bonding, Solaris - IPMP). You can use crossover cables in a QA/DEV environment but they are not supported in a production environment; crossover cables also limit you to a two-node cluster.
The kernel components relate to the background processes, buffer cache and shared pool; managing these resources without conflicts and corruption requires special handling.
In RAC as more than one instance is accessing the resource, the instances require better coordination at the resource management level. Each node will have its own set of buffers but will be able to request and receive data blocks currently held in another instance's cache. The management of data sharing and exchange is done by the Global Cache Services (GCS).
All the resources in the cluster group form a central repository called the Global Resource Directory (GRD), which is distributed. Each instance masters some set of resources and together all instances form the GRD. The resources are equally distributed among the nodes based on their weight. The GRD is managed by two services called Global Caches Services (GCS) and Global Enqueue Services (GES), together they form and manage the GRD. When a node leaves the cluster, the GRD portion of that instance needs to be redistributed to the surviving nodes, a similar action is performed when a new node joins.
Each node has its own background processes and memory structures. On top of the usual ones, additional processes manage the shared resources; these additional processes maintain cache coherency across the nodes.
Cache coherency is the technique of keeping multiple copies of a buffer consistent between different Oracle instances on different nodes. Global cache management ensures that access to a master copy of a data block in one buffer cache is coordinated with the copy of the block in another buffer cache.
The sequence of an operation is as follows:
- When instance A needs a block of data to modify, it reads the block from disk; before reading, it must inform the GCS (DLM). GCS keeps track of the lock status of the data block by keeping an exclusive lock on it on behalf of instance A.
- Now instance B wants to modify that same data block, so it too must inform GCS. GCS then requests instance A to release the lock, which ensures that instance B gets the latest version of the data block (including instance A's modifications), and then exclusively locks it on instance B's behalf.
- At any one point in time, only one instance has the current copy of the block, thus keeping the integrity of the block.
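The steps above can be sketched as a toy model with a single block whose exclusive lock is tracked by GCS. The variable and function names (`holder`, `version`, `request_exclusive`, `modify_block`) are invented for illustration and do not correspond to anything in Oracle.

```shell
# Toy model of the sequence above: one data block, with GCS tracking which
# instance holds the exclusive lock. All names here are hypothetical.
holder=""     # instance that currently holds the exclusive lock
version=0     # stands in for the block contents; bumped on every change

# An instance must ask GCS for the lock before it may modify the block.
request_exclusive() {
    instance="$1"
    if [ -n "$holder" ] && [ "$holder" != "$instance" ]; then
        echo "GCS: asking $holder to release the block at version $version"
    fi
    holder="$instance"
    echo "GCS: $instance holds the exclusive lock and sees version $version"
}

# Only the current lock holder may change the block.
modify_block() {
    if [ "$holder" != "$1" ]; then
        echo "error: $1 does not hold the lock" >&2
        return 1
    fi
    version=$((version + 1))
}

request_exclusive A   # instance A reads the block and locks it
modify_block A        # version becomes 1
request_exclusive B   # GCS has A release the lock; B receives A's change
modify_block B        # version becomes 2
```

Because every change goes through the single `holder`, only one instance can ever own the current copy, which is the integrity guarantee described in the last step.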
GCS maintains data coherency and coordination by keeping track of the lock status of each block that can be read or written by any node in the RAC. GCS is an in-memory database that contains information about current locks on blocks and instances waiting to acquire locks. This is known as Parallel Cache Management (PCM). The Global Resource Manager (GRM) helps to coordinate and communicate the lock requests from Oracle processes between instances in the RAC. Each instance has a buffer cache in its SGA. To ensure that each RAC instance obtains the block that it needs to satisfy a query or transaction, RAC uses two processes, the GCS and GES, which maintain records of the lock status of each data file and each cached block using the GRD.
So what is a resource? It is an identifiable entity: it has a name or a reference, and it can be an area in memory, a disk file or an abstract entity. A resource can be owned or locked in various states (exclusive or shared). Any shared resource is lockable; if it is not shared, no access conflict will occur.
A global resource is a resource that is visible to all the nodes within the cluster. Data buffer cache blocks are the most obvious and most heavily used global resource; transaction enqueues and database data structures are other examples. GCS handles data buffer cache blocks and GES handles all the non-data block resources.
All caches in the SGA are either global or local: dictionary and buffer caches are global, while large and java pool buffer caches are local. Cache fusion is used to read the data buffer cache from another instance instead of getting the block from disk. Cache fusion therefore moves current copies of data blocks between instances (which is why you need a fast private network), and GCS manages the block transfers between the instances.
Finally we get to the processes
| Oracle RAC Daemons and Processes | | |
| LMSn | Lock Manager Server process - GCS | This is the cache fusion part and the most active process; it handles the consistent copies of blocks that are transferred between instances. It receives requests from LMD to perform lock requests. It rolls back any uncommitted transactions. There can be up to ten LMS processes running, and they can be started dynamically if demand requires it. They manage lock manager service requests for GCS resources and send them to a service queue to be handled by the LMSn process. It also handles global deadlock detection and monitors for lock conversion timeouts. As a performance gain, you can increase this process's priority to make sure CPU starvation does not occur. You can see the statistics of this daemon by looking at the view X$KJMSDP. |
| LMON | Lock Monitor Process - GES | This process manages the GES; it maintains consistency of the GCS memory structure in case of process death. It is also responsible for cluster reconfiguration and lock reconfiguration (node joining or leaving), checks for instance deaths and listens for local messaging. A detailed log file is created that tracks any reconfigurations that have happened. |
| LMD | Lock Manager Daemon - GES | This manages the enqueue manager service requests for the GCS. It also handles deadlock detection and remote resource requests from other instances. You can see the statistics of this daemon by looking at the view X$KJMDDP. |
| LCK0 | Lock Process - GES | Manages instance resource requests and cross-instance call operations for shared resources. It builds a list of invalid lock elements and validates lock elements during recovery. |
| DIAG | Diagnostic Daemon | This is a lightweight process that uses the DIAG framework to monitor the health of the cluster. It captures information for later diagnosis in the event of failures. It will perform any necessary recovery if an operational hang is detected. |
Thursday, August 18, 2011
RAC Interview Questions
What are the Oracle Clusterware processes for 10g on Unix and Linux?
Cluster Synchronization Services (ocssd): Manages cluster node membership and runs as the oracle user; failure of this process results in cluster restart.
Cluster Ready Services (crsd): The crs process manages cluster resources (which could be a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application process, and so on) based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. This process runs as the root user.
Event manager daemon (evmd): A background process that publishes events that crs creates.
Process Monitor Daemon (OPROCD): This process monitors the cluster and provides I/O fencing. OPROCD performs its check, stops running, and if the wake-up is beyond the expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.
RACG (racgmain, racgimon): Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.
What are the Oracle database background processes specific to RAC?
•LMS—Global Cache Service Process
•LMD—Global Enqueue Service Daemon
•LMON—Global Enqueue Service Monitor
•LCK0—Instance Enqueue Process
To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction, Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances.
What are the Oracle Clusterware components?
Voting Disk: Oracle RAC uses the voting disk to manage cluster membership by way of a health check and arbitrates cluster ownership among the instances in case of network failures. The voting disk must reside on shared disk.
Oracle Cluster Registry (OCR): Maintains cluster configuration information as well as configuration information about any cluster database within the cluster. The OCR must reside on shared disk that is accessible by all of the nodes in your cluster.
How do you troubleshoot a node reboot?
Please check metalink ...
Note 265769.1 Troubleshooting CRS Reboots
Note 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle Clusterware Node evictions.
How do you back up the OCR?
There is an automatic backup mechanism for the OCR. The default location is: $ORA_CRS_HOME/cdata/<clustername>/
To display backups :
#ocrconfig -showbackup
To restore a backup:
#ocrconfig -restore <backup_file_name>
With Oracle RAC 10g Release 2 or later, you can also use the export command:
#ocrconfig -export <file_name> -s online
Use the -import option to restore the exported contents.
With Oracle RAC 11g Release 1, you can do a manual backup of the OCR with the command:
# ocrconfig -manualbackup
How do you back up the voting disk?
#dd if=voting_disk_name of=backup_file_name
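A minimal sketch of wrapping the `dd` command above so that each backup gets a timestamped file name. The function name `backup_voting_disk`, the `.bak` suffix and the example paths are assumptions made for illustration; run it as a user that can read the voting disk.

```shell
# Copy a voting disk to a timestamped backup file with dd and print the
# backup path. Function name and arguments are illustrative only.
backup_voting_disk() {
    src="$1"
    dest_dir="$2"
    dest="$dest_dir/$(basename "$src").$(date +%Y%m%d%H%M%S).bak"
    # Same dd invocation as above; dd writes its transfer statistics to stderr.
    dd if="$src" of="$dest" 2>/dev/null || return 1
    echo "$dest"
}

# Example (both paths are placeholders):
# backup_voting_disk /path/to/voting_disk /path/to/backup_dir
```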
How do I identify the voting disk location?
#crsctl query css votedisk
How do I identify the OCR file location?
Check /var/opt/oracle/ocr.loc or /etc/ocr.loc (depending on the platform)
or
#ocrcheck
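The file lookup can be scripted along these lines. This sketch assumes ocr.loc holds key=value lines with an `ocrconfig_loc` entry; the helper name `ocr_location` is hypothetical.

```shell
# Print the OCR location recorded in an ocr.loc file.
# Assumption: the file contains a line of the form ocrconfig_loc=<path>.
ocr_location() {
    sed -n 's/^ocrconfig_loc=//p' "$1"
}

# Try the two platform-dependent paths named above; use the first one found.
for f in /var/opt/oracle/ocr.loc /etc/ocr.loc; do
    if [ -r "$f" ]; then
        ocr_location "$f"
        break
    fi
done
```

On a host without Oracle Clusterware neither file exists, so the loop prints nothing.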
Is ssh required for normal Oracle RAC operation?
ssh is not required for normal Oracle RAC operation. However, ssh should be enabled for Oracle RAC and patch set installation.
What is SCAN?
Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g Release 2 feature that provides a single name for clients to access an Oracle Database running in a cluster. The benefit is that clients using SCAN do not need to change if you add or remove nodes in the cluster.
What is the purpose of the private interconnect?
Clusterware uses the private interconnect for cluster synchronization (network heartbeat) and daemon communication between the clustered nodes. This communication is based on the TCP protocol.
RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP). Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches of participating nodes in the cluster.
Why do we have a Virtual IP (VIP) in Oracle RAC?
Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP timeout period (which can be up to 10 min) before getting an error. As a result, you don't really have a good HA solution without using VIPs.
When a node fails, the VIP associated with it is automatically failed over to another node, and the new node re-ARPs to announce the new MAC address for the IP. Subsequent packets sent to the VIP go to the new node, which sends error RST packets back to the clients. As a result, the clients get errors immediately.
What do you do if you see GC CR BLOCK LOST in the top 5 Timed Events in an AWR report?
This is most likely due to a fault in the interconnect network.
Check netstat -s
If you see "fragments dropped" or "packet reassemblies failed", work with your system administrator to find the fault in the network.
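One way to pull just those counters out of the `netstat -s` output is a small filter like the one below. The function name is hypothetical, and the exact counter wording varies by operating system and kernel version, so adjust the pattern to match your platform.

```shell
# Filter `netstat -s` output (read from stdin) for the counters named above.
# Hypothetical helper; counter wording differs between platforms.
find_interconnect_drops() {
    # grep exits non-zero when nothing matches; treat that as "no drops seen".
    grep -Ei 'fragments dropped|packet reassembl(y|ies) failed' || true
}

# Usage:
# netstat -s | find_interconnect_drops
```

Running it twice a few minutes apart shows whether the counters are still climbing, which matters more than their absolute values.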
How many nodes are supported in a RAC database?
10g Release 2 supports 100 nodes in a cluster using Oracle Clusterware, and 100 instances in a RAC database.
Srvctl cannot start an instance and I get the errors PRKP-1001 and CRS-0215, but sqlplus can start it on both nodes. How do you identify the problem?
Set the environment variable SRVM_TRACE to true and start the instance with srvctl. You will now get a detailed error stack.
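A minimal sketch of that step, wrapped in a function so the trace setting does not leak into the rest of the session. The function name and the database and instance names in the usage line are placeholders.

```shell
# Start one instance with srvctl tracing enabled.
# The subshell keeps SRVM_TRACE from staying set in the calling shell.
# Usage: start_instance_traced <db_name> <instance_name>
start_instance_traced() {
    (
        export SRVM_TRACE=true
        srvctl start instance -d "$1" -i "$2"
    )
}

# Example (names are placeholders):
# start_instance_traced ORCL ORCL1
```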
What is the purpose of the ONS daemon?
The Oracle Notification Service (ONS) daemon is a daemon started by the CRS clusterware as part of the nodeapps. There is one ONS daemon started per clustered node.
The Oracle Notification Service daemon receives a subset of published clusterware events via the local evmd and racgimon clusterware daemons and forwards those events to application subscribers and to the local listeners.
This is done in order to facilitate:
a. FAN (Fast Application Notification), the feature that allows applications to respond to database state changes.
b. the 10gR2 Load Balancing Advisory, the feature that permits load balancing across different RAC nodes depending on the load on each node. The RDBMS MMON process creates an advisory for the distribution of work every 30 seconds and forwards it via racgimon and ONS to listeners and applications.