Thursday 17 November 2011

Export & Import Category


Many times we may observe that Data Pump runs slow even when using the PARALLEL option, yet we may not know that there is a method to calculate a suitable value for the PARALLEL parameter. The information below will help us do that.
Data Pump uses two methods to export/import data:
a. Direct path
b. External tables
It is up to Data Pump to decide which path to use for each table. This means that, within the same Data Pump job, some tables may be exported/imported through direct path and others through external tables.
1. The PARALLEL parameter in Data Pump can be set to a value greater than one only in Enterprise Edition.
2. PARALLEL is not very effective on jobs that consist mostly of metadata.
3. As we know, in Data Pump a master process controls the entire export/import through worker processes. These worker processes in turn start parallel execution (PX) processes to do the actual work. We can increase or decrease the parallelism at any moment using the interactive prompt (see the example after this list).
4. If the worker processes are active, decreasing the PARALLEL value will not take effect until the current work reaches a completion point; an increase in parallel processes, however, takes effect immediately.
5. For Data Pump Export, the value specified for the PARALLEL parameter should be less than or equal to the number of files in the dump file set (if you specified the dump files explicitly and did not use the %U substitution variable).
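As an illustration of point 3 (the job name and credentials here are placeholders, not from any real job), attaching to a running export and changing its parallelism from the interactive prompt could look like this:

$ expdp system/password ATTACH=SYS_EXPORT_SCHEMA_01
Export> STATUS
Export> PARALLEL=4
Export> CONTINUE_CLIENT

STATUS shows the current worker processes, PARALLEL=4 requests the new degree (an increase applies immediately, a decrease waits for a completion point), and CONTINUE_CLIENT returns the client to logging mode.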
HOW EXPORT WITH PARALLEL WORKS
1. The master process starts multiple worker processes. At least two worker processes are created for an export: one for metadata and the other for table data.
2. You might have observed that a Data Pump export reports an estimated data size (even if we don't specify the ESTIMATE option). This estimate is used to calculate the number of parallel execution (PX) processes.
3. The columns of the tables are examined next, to decide whether to use the direct path or the external tables method.
Note: Direct path does not support a PARALLEL value greater than one.
4. If the external tables method is chosen, Data Pump determines the maximum number of PX processes that can work on a table's data. It does this by dividing the estimated size of the table data by 250 MB and rounding the result down. If the result is zero or one, then PX processes are not used.
Example: If the data size is 1000 MB, dividing it by 250 MB gives 4. One worker process handles metadata and one worker process handles data; the data worker in turn gets 4 corresponding PX processes, so the total number of processes is 6.
Note: PX process information is not shown when we check through the STATUS command, so in the above example we will see only 2 worker processes instead of 6 (a query that does show the PX processes is sketched after this list).
5. Even if we specify a higher PARALLEL value, Oracle still calculates the worker and PX processes as above.
Example: if we set PARALLEL=10 in the above example, Oracle still uses only 6 processes.
Note: We need to remember that Oracle will not scale up the processes if we specify too low a PARALLEL value, so we should always give a generous value in order to get the maximum benefit.
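Since STATUS does not list the PX processes, a rough way to see them while the job is running is to join the Data Pump session views with V$PX_SESSION, for example:

-- sessions belonging to running Data Pump jobs
SELECT s.sid, s.serial#, s.program
  FROM v$session s, dba_datapump_sessions d
 WHERE s.saddr = d.saddr;

-- PX slaves whose query coordinator is one of those Data Pump sessions
SELECT p.sid, p.qcsid, p.server_group, p.server_set
  FROM v$px_session p
 WHERE p.qcsid IN (SELECT s.sid
                     FROM v$session s, dba_datapump_sessions d
                    WHERE s.saddr = d.saddr);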

HOW IMPORT WITH PARALLEL WORKS
The PARALLEL parameter works a bit differently in Import because there are various dependencies and everything must be done in order.
Data Pump Import processes the database objects in the following order:
1. The first worker begins to load all the metadata: the tablespaces, schemas, etc., until all the tables are created.
2. Once the tables are created, the first worker and the rest of the workers all start loading data.
3. Once the table data is loaded, the first worker returns to loading metadata again (indexes and other objects). The rest of the workers remain idle until the first worker has loaded all the metadata.
Note: One worker creates all the indexes but uses PX processes up to the PARALLEL value so indexes get created faster.
Thus, an import job can be started with a PARALLEL = 10, and the user will only see one worker being utilized at certain points during job execution. No other workers or Parallel Execution Processes will be working until all the tables are created. When the tables are created, a burst of workers and possibly PX processes will execute in parallel until the data is loaded, then the worker processes will become idle.
Before starting any export/import, it is better to run it first with the ESTIMATE_ONLY parameter. Divide the reported size by 250 MB and decide on the PARALLEL value based on the result, for example as shown below.
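A minimal sketch of such an estimate-only run (the schema name and credentials are placeholders):

$ expdp system/password SCHEMAS=scott ESTIMATE_ONLY=Y NOLOGFILE=Y

If the reported estimate were, say, 1500 MB, then 1500 / 250 = 6 would be a reasonable starting value for PARALLEL.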
Finally, when using the PARALLEL option, keep the points below in mind:
a. Set the degree of parallelism to two times the number of CPUs, then tune from there.
b. For Data Pump Export, the PARALLEL parameter value should be less than or equal to the number of dump files.
c. For Data Pump Import, the PARALLEL parameter value should not be much larger than the number of files in the dump file set.
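Putting these points together, a typical export command (the directory, credentials and file names here are only illustrative) might look like this, with the %U substitution variable creating as many dump files as the parallel streams need:

$ expdp system/password SCHEMAS=scott DIRECTORY=dp_dir DUMPFILE=scott_%U.dmp LOGFILE=scott_exp.log PARALLEL=6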
For more details, you can refer to MOS doc 365459.1
Hope this post will help you & me when we perform Parallel datapump operations next time…
If you like this post, Plz rate it :-)
Let's suppose we have a dump file and we have no clue which database, or at least which database version, it was exported from (assume we don't have a log file).
We got a request to import it into another database. As per the compatibility matrix, you can only import into the same or a higher version, and for this we need to know the version of the current dump file.
In this situation, the following methods will help you….
Method # 1
__________
For classic export dump files on Unix systems, you can use the command below:
$ cat /tmp/scott_export.dmp | head | strings
Note that scott_export.dmp is the dumpfile name in my example
Method # 2
__________
You can generate a trace file which extracts a lot of information:
$ impdp DIRECTORY=dp_dir DUMPFILE=scott_export.dmp NOLOGFILE=y SQLFILE=impdp_s.sql TABLES=notexist TRACE=100300
A sample output is as follows
KUPF: In kupfioReadHeader…
KUPF: 16:31:56.121: newImpFile: EXAMINE_DUMP_FILE
KUPF: 16:31:56.121: ……DB Version = 10.02.00.03.00
KUPF: 16:31:56.121: File Version Str = 1.1
KUPF: 16:31:56.121: File Version Num = 257
KUPF: 16:31:56.131: Version CapBits1 = 32775
KUPF: 16:31:56.131: ……Has Master = 0
KUPF: 16:31:56.131: ……..Job Guid = B94302C5DAB344E1876105E3295269C6
KUPF: 16:31:56.131: Master Table Pos = 0
KUPF: 16:31:56.131: Master Table Len = 0
KUPF: 16:31:56.131: …..File Number = 1
KUPF: 16:31:56.131: ……Charset ID = 46
KUPF: 16:31:56.131: …Creation date = Thu Jul 13 15:25:32 2011
KUPF: 16:31:56.141: ………..Flags = 2
KUPF: 16:31:56.141: ……Media Type = 0
KUPF: 16:31:56.141: ……..Job Name = "SYSTEM"."SYS_EXPORT_FULL_01"
KUPF: 16:31:56.141: ……..Platform = IBMPC/WIN_NT-8.1.0
KUPF: 16:31:56.141: ……..Language = WE8ISO8859P15
KUPF: 16:31:56.141: …….Blocksize = 4096
KUPF: 16:31:56.141: newImpFile: file; /tmp/scott_export.dmp, FID; 1
KUPF: 16:31:56.151: In expandwildcard. wildcard count = 1
KUPF: In kupfxExmDmpFile…
KUPF: In kupfuExmDmpFile…
KUPF: In kupfioReadHeader…
Note: I tried the above method on a 10g database. Please check it on a test 9i database before you run it on production.
Method # 3
__________
From Oracle 10g, there is another way to do this: we can use the GET_DUMPFILE_INFO procedure of the DBMS_DATAPUMP package to read the dump file header, where this information is stored. For this we can use a pre-defined stored procedure.
To get the procedure script and other details, refer to MOS doc 462488.1
Note: You can also use this procedure on a dump file taken from a 9i database, but it will not give complete details (you will still get the version).
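A minimal sketch of the underlying call (the file and directory names are placeholders; the full, more robust procedure is in the MOS note):

SET SERVEROUTPUT ON
DECLARE
  v_info     sys.ku$_dumpfile_info;  -- collection of (item_code, value) pairs from the file header
  v_filetype NUMBER;                 -- dump file type code
BEGIN
  DBMS_DATAPUMP.GET_DUMPFILE_INFO(
      filename   => 'scott_export.dmp',
      directory  => 'DP_DIR',
      info_table => v_info,
      filetype   => v_filetype);
  DBMS_OUTPUT.PUT_LINE('File type code: '||v_filetype);
  IF v_info IS NOT NULL THEN
    FOR i IN 1 .. v_info.COUNT LOOP
      -- one of these items holds the database version the file was created with
      DBMS_OUTPUT.PUT_LINE('Item '||v_info(i).item_code||': '||v_info(i).value);
    END LOOP;
  END IF;
END;
/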

Method # 4
__________
You can use a script which is mentioned in my previous post. For the script CLICK HERE
We might be using both direct and conventional path exports regularly, but here are some more interesting facts about the direct path approach.
1. Oracle introduced direct path export in version 7.3.
2. Normally, export follows the path of a SELECT statement: data is read from disk into the buffer cache and then written to the dump file. When we use direct path by specifying DIRECT=Y in the export command, Oracle copies the data directly from disk into the PGA and writes it from there to the dump file.
This is the reason why direct path is faster than conventional path.
3. To improve the performance of direct path still further, we can use the RECORDLENGTH parameter in export. Possible values for RECORDLENGTH are multiples of the OS block size or multiples of DB_BLOCK_SIZE. If we don't specify RECORDLENGTH and still use DIRECT=Y, Oracle takes the default OS block size (in most environments, 1024 bytes).
Example: direct path performance can improve by around 50% when used with a RECORDLENGTH of 64 KB.
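For instance (the owner, credentials and file name are placeholders), a direct path export with a 64 KB record length could be run as:

$ exp system/password OWNER=scott FILE=scott_direct.dmp DIRECT=Y RECORDLENGTH=65536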
So far we have seen the advantages. Let's talk about the restrictions of using direct path:
1. Direct path does not work for LOBs if your export release is 8.1.4 or lower; Oracle will not export the rows in that case. From 8i onwards, even if you use direct path on tables with LOBs, the export is switched to the conventional path for those tables.
Because of this, I always use direct path in my daily work, which makes exports faster, of course only for 9i databases :-)
2. The QUERY and BUFFER parameters cannot be used along with DIRECT=Y in the export command.
3. Direct path exports the data only if the NLS_LANG character set is the same as the database character set. If it is different, export throws the following error:
EXP-00041: Export done in server’s UTF8, different from user’s character set WE8ISO8859P1
EXP-00000: Export terminated unsuccessfully.
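To avoid this, set NLS_LANG to match the database character set before running the export. For example on Unix (the character set below matches the error above; adjust it to your own database and platform):

$ export NLS_LANG=AMERICAN_AMERICA.UTF8
$ exp scott/tiger FILE=scott.dmp DIRECT=Y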
Hope this information helps you to understand direct path better.
Plz rate this post if you like it !
Sometimes we may get a requirement to clean up Data Pump jobs that were stopped abruptly for some reason. The following steps will help us do that.
1. First we need to identify which jobs are in NOT RUNNING status. For this, we use the query below (basically we are getting this information from DBA_DATAPUMP_JOBS):
SET lines 200
SELECT owner_name, job_name, operation, job_mode,
state, attached_sessions
FROM dba_datapump_jobs
ORDER BY 1,2;
The above query gives the Data Pump job information, and the output will look like this:
OWNER_NAME JOB_NAME            OPERATION JOB_MODE  STATE       ATTACHED
———- ——————- ——— ——— ———– ——–
SCOTT      SYS_EXPORT_TABLE_01 EXPORT    TABLE     NOT RUNNING        0
SCOTT      SYS_EXPORT_TABLE_02 EXPORT    TABLE     NOT RUNNING        0
SYSTEM     SYS_EXPORT_FULL_01  EXPORT    FULL      NOT RUNNING        0
In the above output, you can see the state showing as NOT RUNNING; those are the jobs that need to be removed.
Note: Please note that the job state will also show as NOT RUNNING if a user intentionally stopped the job. So before taking any action, consult the user and get confirmation.
2. We now need to identify the master tables created for these jobs. This can be done as follows:
SELECT o.status, o.object_id, o.object_type,
       o.owner||'.'||object_name "OWNER.OBJECT"
  FROM dba_objects o, dba_datapump_jobs j
 WHERE o.owner=j.owner_name AND o.object_name=j.job_name
   AND j.job_name NOT LIKE 'BIN$%' ORDER BY 4,2;
STATUS   OBJECT_ID OBJECT_TYPE  OWNER.OBJECT
——- ———- ———— ————————-
VALID        85283 TABLE        SCOTT.EXPDP_20051121
VALID        85215 TABLE        SCOTT.SYS_EXPORT_TABLE_02
VALID        85162 TABLE        SYSTEM.SYS_EXPORT_FULL_01
3. We now need to drop these master tables in order to clean up the jobs:
SQL> DROP TABLE SYSTEM.SYS_EXPORT_FULL_01;
SQL> DROP TABLE SCOTT.SYS_EXPORT_TABLE_02 ;
SQL> DROP TABLE SCOTT.EXPDP_20051121;
4. Re-run the query used in step 1 to check whether any jobs are still showing up. If so, we need to stop those jobs once again, using the STOP_JOB (or KILL_JOB) command at the expdp interactive prompt or the DBMS_DATAPUMP.STOP_JOB procedure, as sketched below.
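For example (the job name, owner and credentials are placeholders), either of the following approaches can be used; note that KILL_JOB also drops the job's master table:

$ expdp system/password ATTACH=SCOTT.SYS_EXPORT_TABLE_01
Export> KILL_JOB

or from PL/SQL:

DECLARE
  h NUMBER;
BEGIN
  h := DBMS_DATAPUMP.ATTACH('SYS_EXPORT_TABLE_01', 'SCOTT');   -- attach to the existing job
  DBMS_DATAPUMP.STOP_JOB(h, immediate => 1, keep_master => 0); -- stop it without keeping the master table
END;
/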
Some important points:
1. Data Pump jobs that are not running don't have any impact on currently executing ones.
2. When any Data Pump job (either export or import) is initiated, master and worker processes are created.
3. When we terminate an export Data Pump job, the master and worker processes get killed, and it does not lead to data corruption.
4. But when an import Data Pump job is terminated, the import may be left incomplete, because the master and worker processes are killed.
Hi friends,
Yesterday I faced a tricky problem while doing export/import and want to share it with you.
I have a Windows server which has both 10g and 9i Oracle homes. I got a requirement to take a schema export. When I tried, I got the error below:
Connected to: Oracle9i Enterprise Edition Release 9.2.0.8.0 – Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 – Production
Export done in WE8MSWIN1252 character set and AL16UTF16 NCHAR character set
About to export specified users …
. exporting pre-schema procedural objects and actions
EXP-00008: ORACLE error 6550 encountered
ORA-06550: line 1, column 13:
PLS-00201: identifier 'SYS.DBMS_CDC_EXPDP' must be declared
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored
EXP-00083: The previous problem occurred when calling SYS.DBMS_CDC_EXPDP.schema_info_exp
EXP-00008: ORACLE error 4068 encountered
ORA-04068: existing state of packages has been discarded
ORA-04063: package body "SYS.DBMS_LOGREP_UTIL" has errors
ORA-06508: PL/SQL: could not find program unit being called
ORA-06512: at “SYS.DBMS_LOGREP_EXP”, line 1815
ORA-06512: at line 1
EXP-00083: The previous problem occurred when calling SYS.DBMS_LOGREP_EXP.schema_info_exp
EXP-00008: ORACLE error 6550 encountered
ORA-06550: line 1, column 13:
PLS-00201: identifier 'SYS.DBMS_CDC_EXPVDP' must be declared
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored
EXP-00083: The previous problem occurred when calling SYS.DBMS_CDC_EXPVDP.schema_info_exp
. exporting foreign function library names for user SMEGWY
. exporting PUBLIC type synonyms
. exporting private type synonyms
. exporting object type definitions for user SMEGWY
About to export SMEGWY’s objects …
. exporting database links
. exporting sequence numbers
. exporting cluster definitions
EXP-00056: ORACLE error 6502 encountered
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "SYS.DBMS_ASSERT", line 163
ORA-06512: at "SYS.DBMS_METADATA_INT", line 1762
ORA-06512: at "SYS.DBMS_METADATA_INT", line 2034
ORA-06512: at "SYS.DBMS_METADATA_INT", line 4058
ORA-06512: at "SYS.DBMS_METADATA", line 695
ORA-06512: at line 1
EXP-00000: Export terminated unsuccessfully
Initially I thought it was a problem with the exp utility version, since I found that the default bin path was pointing to the 10g home. I then switched to the 9i bin path and executed the export, but still faced the same problem. After further analysis, I found that the data dictionary had been affected because another database on the same server had been upgraded from 9i to 10g.
Finally, I ran catalog.sql, catproc.sql and catexp.sql on my 9i database and then the export went fine.
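For reference (the paths assume the standard rdbms/admin location under the Oracle home), these scripts can be re-run as SYSDBA like this:

$ sqlplus "/ as sysdba"
SQL> @?/rdbms/admin/catalog.sql
SQL> @?/rdbms/admin/catproc.sql
SQL> @?/rdbms/admin/catexp.sql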
Sometimes you may face this situation during import also. The reasons could be:
1) You are using the wrong version of the import utility: choose the correct bin location in that case.
2) You are using Data Pump to import, but the export was not taken with Data Pump: use the traditional import utility instead.
3) The dictionary of the target database is corrupted: run the scripts specified above.
This may be helpful for you too…
