Re: creating view to select unique records



On Jan 31, 7:41 am, "set...@xxxxxxxxx" <set...@xxxxxxxxx> wrote:
Hello All,

I am unsure if similar question has already been posted to this group.
If so please point me to the relevant post/group.. otherwise please
read on...

We are in the process of migrating our application from Sybase to
Oracle. Basically for each table, we would do an bcp out of the table
and load it into oracle via sqlldr.

The source table has duplicate rows in it. However, we would not want
to export those dupe rows to our oracle table.

My intention is to create a view that would pick up only the unique
records and I can bcp the data out of the view.

I can create a view something like
select distict * from sybase_table. However, I am worried about the
performance since I have heard that distinct clause is a costly
operation ( I am working on about 120-140 million rows ).

Is there a better way to achieve it?

Thanks in advance
Arun

Arun:

Are the duplication criteria just a few columns or all columns? If
just a few, it might be better to create a unique constraint on those
columns on the target (Oracle) side and allow SQL*Loader to handle the
rejected duplicates. Naturally, having any indexes (which is, of
course, how unique/PK constraints are enforced) on the target table
can slow loads into that table, so you should use a DIRECT PATH load
in SQL*Loader.

Another approach might be to delete the duplicate rows out of the
source table (or a copy of it if you're nervous ;-) first. Of course,
deleting from such a large table will also be costly, but maybe not as
costly as the DISTINCTing.

Finally, you may want to dump the source data to a text file, use any
myriad of (fast) text-manipulation utilities to remove the duplicates,
then use SQL*Loader to load the cleansed text file into the target.

HTH,

--Jeff
.



Relevant Pages

  • Re: How can Access update one record out of duplicate records.
    ... Access database directly to update duplicate records. ... becuase ROWID can change especially with busy applications. ... >> have several duplicate rows. ...
    (microsoft.public.access.queries)
  • Re: Avoiding dupes when merging files
    ... >> dupes? ... >> Or you could load the data into a datatable (or hash table type ... >> the datatable should throw a duplicate PK exception which you would ... Add the primary key array to the DataTable.PrimaryKey property. ...
    (microsoft.public.dotnet.languages.vb)
  • Re: Overwriting old duplicate
    ... Use TransferText to import and DoCmd.RunQuery to run the data load. ... > I have a database table in Access 97. ... > CSR and Daily_Date ... > that duplicate with the "newer" duplicate. ...
    (microsoft.public.access.externaldata)
  • Re: How do I remove Duplicate rows?
    ... I altered the code as follows, and it will delete duplicate ... I need to delete the duplicate rows that are side-by-side and the ... Dim Numrows As Integer ... Dim lignesEgales As Boolean ...
    (microsoft.public.excel.misc)
  • Re: How can Access update one record out of duplicate records.
    ... If you added an Autonumber field to the table, ... > have several duplicate rows. ... I am wondering what does Access use to ditinguish between the duplicate ...
    (microsoft.public.access.queries)

Loading