Friday, March 30, 2012
Partial Replication
Is it possible to replicate just some of the records in a table, and can such a filter be implemented? (I am trying to avoid a situation where employees have the full database while they are away from the office [they only need the records that are assigned to them].)
Thank you,
Shmuel Shulman
SBS Technologies LTD
See "replication, filtering options" in BOL.
"S Shulman" <smshulman@.hotmail.com> wrote in message
news:uB8E%2378hFHA.2472@.TK2MSFTNGP15.phx.gbl...
|||You can filter replication horizontally or vertically using a filter you create.
Example: you have a table that has a column "User".
In the merge agent parameters, use the option -Hostname [Dave] (or whatever value works).
Then in the filter clause use:
User = HOST_NAME()
That particular subscriber will only get the records where User = "Dave".
This is for merge replication; I'm not sure about the other types of replication.
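For reference, a minimal T-SQL sketch of how such a dynamic row filter might be declared when the article is added to a merge publication. The publication, article, and table names below are placeholders, not anything from this thread:
-- Placeholder names; the filter compares each row's [User] column to HOST_NAME(),
-- which the merge agent's -Hostname parameter can override per subscriber.
EXEC sp_addmergearticle
    @publication = N'AssignmentsPub',
    @article = N'Assignments',
    @source_object = N'Assignments',
    @subset_filterclause = N'[User] = HOST_NAME()';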
|||Thanks all for your help
Shmuel
<seanbell68@.gmail.com> wrote in message
news:1121305341.878967.300120@.o13g2000cwo.googlegroups.com...
Wednesday, March 28, 2012
Parsing Variable Length Delimited Records
The problem is that although each segment has a defined number of fields, N, the standard states that if the final M fields are empty/blank they are not to be sent. Thus, a segment defined to have 20 fields may have 6 the first time I see it, 13 the next time, etc. To access the columns in VBScript I use DTSSource("Col001"). This works as long as the columns are there, but gives an error when they are not. Is there a parameter telling me how many columns are defined? Or is there something akin to IFEXISTS("Colxxx"), or exceptions?
How can I handle this situation? One suggestion has been to pass the entire segment to the Transforms section and break it up there.
Finally, what resources can you point me to for reference? I'd like to get good at using DTS since my client wants their project written for it.
Thanks for your help,
--greg
SSIS has built-in functionality for importing text files, although that functionality doesn't handle a variable number of columns very well.
Fear not though - the script component is your friend here. I highly recommend Donald Farmer's book which includes a chapter on importing text files using the script component.
-Jamie
Monday, March 26, 2012
parent/child text File Import
I have a file with parent and child records in the same file.
My game plan is to get it into a table recognize the parents and put them in the parent table and recognize the children and put them in the children table.
The problem I am having is that the flat file import chokes on it because the column widths are different from line to line (parent / child).
Any thoughts or suggestions would be appreciated.
Mardo
You'll need to import the data into the pipeline as a single wide column and then parse out the required values using a combination of the conditional split component and the derived column component.
-Jamie
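If you do land the raw lines in a staging table first (as in your game plan), the parent/child split itself can be a couple of set-based statements. A rough T-SQL sketch, where the staging table, the record-type prefix, and all column positions are purely illustrative assumptions:
-- Assumes dbo.Staging(RawLine) holds one raw line per row and that the first
-- two characters identify the record type: '01' = parent, '02' = child.
INSERT INTO dbo.ParentTable (ParentKey, ParentName)
SELECT SUBSTRING(RawLine, 3, 10), SUBSTRING(RawLine, 13, 30)
FROM dbo.Staging
WHERE LEFT(RawLine, 2) = '01';

INSERT INTO dbo.ChildTable (ParentKey, ChildValue)
SELECT SUBSTRING(RawLine, 3, 10), SUBSTRING(RawLine, 13, 20)
FROM dbo.Staging
WHERE LEFT(RawLine, 2) = '02';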
Friday, March 23, 2012
Parent Child Dimension Question
I have a simple parent child dimension set up where the root level records have a ParentID of <null>.
When I browse the hierarchy the root level members appear correctly, but when I drill into one of the levels of the hierarchy, the root level element is duplicated at the next level even though there is no such relationship in the dimension.
Here is a simplified example of records in the dimension:
ID Name ParentID
1 Root1 <NULL>
2 Level1 1
3 Root2 <NULL>
4 Level2 3
The hierarchy looks like this when I browse:
All
- Root1
- Root1
- Level1
- Root2
- Root2
- Level2
How can I keep the root elements from showing up at the 2nd level?
Did you set the "MembersWithData" property of the Parent attribute to "NonLeafDataHidden"?
http://msdn2.microsoft.com/en-us/library/ms174919(SQL.90).aspx
>>
MembersWithData
Used by parent attributes to determine whether to display data members for non-leaf members in the parent attribute. This property value is only used when the value of the Usage property is set to Parent, meaning that a parent-child hierarchy has been defined.
>>
|||I didn't think that was the problem, but I did change the "MembersWithData" property from "NonLeafDataVisible" to "NonLeafDataHidden" and I still have the same problem.
I also noticed that the problem is not isolated to the root level members either; each level repeats its parent.
So my hierarchy really looks like:
All
- Root1
+ Root1 (leaf)
- Level1
- Level1 (leaf)
- Root2
+ Root2 (leaf)
- Level2
- Level2 (leaf)
When it should look like
All
- Root1
- Level1
- Leaf1
- Root2
- Level2
- Leaf2
I also have the appropriate primary and foreign key relationships defined in this dimension table.
Any other ideas?
Thanks in advance.
|||Well, in my case, changing the "MembersWithData" property to "NonLeafDataHidden" fixed this issue (it is "NonLeafDataVisible" by default, according to BOL). So, do you find that switching this property on the Parent attribute has no effect on the visible hierarchy at all - which would be surprising?
PS: you can confirm this in the Adventure Works Employee dimension:
- on the Employees attribute, change the "MembersWithData" property to "NonLeafDataVisible"
- deploy the Employee dimension change and then browse the Employees parent-child hierarchy:
- you should now see each non-leaf node (starting with the root) also listed as a leaf child of itself
- switch the property back to "NonLeafDataHidden", re-deploy the dimension, and check by browsing
Partitioning performance on one disk?
Hi,
I know partitioning improves performance on large tables.
We have a table of approximately 100,000,000 records.
We want to partition the table, but we were wondering if there is still an increase in performance if all the partitions are on the same disk. We want to partition the data per month, but we don't have any disks left where we can spread the partitions.
So my main question is:
Is there still a performance increase when you use partitioning on one disk instead of multiple disks?
Thx!
Regards Tony
Sure, though the gain is maximized when you partition across multiple disks and the system is running on multiple CPUs.
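For what it's worth, a rough sketch of monthly partitioning that keeps every partition on the same disk (SQL Server 2005 syntax; the object names, columns, and boundary dates are illustrative only):
-- ALL TO ([PRIMARY]) maps every partition to the same filegroup, i.e. one disk.
CREATE PARTITION FUNCTION pfMonthly (datetime)
AS RANGE RIGHT FOR VALUES ('20120101', '20120201', '20120301', '20120401');

CREATE PARTITION SCHEME psMonthly
AS PARTITION pfMonthly ALL TO ([PRIMARY]);

CREATE TABLE dbo.BigSales
(
    SaleID bigint NOT NULL,
    SaleDate datetime NOT NULL,
    Amount money NOT NULL
) ON psMonthly (SaleDate);
Even on a single disk you still get partition elimination and the ability to manage the data a month at a time; the extra I/O parallelism only comes from spreading the partitions across multiple disks.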
Wednesday, March 21, 2012
Partitioning - Execution Plan
I'm considering creating a partitioned view on a Sales table that contains up to 20 million records. I'm doing some tests and have a result I'm questioning when looking at the Execution Plan generated.
First, let me explain my structure a bit. In my Sales table, I have Ex-Factory sales information and externally provided information. They're distinguished using an identifier (0=Ex Factory, 1=External).
I've created 8 tables:
SalesTSA1999: Check Constraint (Origin=1 And BilledDate BETWEEN '19990101' And '19991231')
SalesTSA2000: Check Constraint (Origin=1 And BilledDate BETWEEN '20000101' And '20001231')
SalesTSA2001: Check Constraint (Origin=1 And BilledDate BETWEEN '20010101' And '20011231')
SalesTSA2002: Check Constraint (Origin=1 And BilledDate BETWEEN '20020101' And '20021231')
SalesXFactory1999: Check Constraint (Origin=0 And BilledDate BETWEEN '19990101' And '19991231')
SalesXFactory2000: Check Constraint (Origin=0 And BilledDate BETWEEN '20000101' And '20001231')
SalesXFactory2001: Check Constraint (Origin=0 And BilledDate BETWEEN '20010101' And '20011231')
SalesXFactory2002: Check Constraint (Origin=0 And BilledDate BETWEEN '20020101' And '20021231')
and a view, viewSalesPartitioned, that includes all these tables using a UNION ALL.
Then I'm issuing the following query:
Select Sum(Amount)
From viewSalesPartitioned
Where Origin=1 AND BilledDate BETWEEN '20020101' AND '20021231' --No worry about Time, it's all set to 00:00:00
When looking at the Execution Plan in Query Analyzer I can see the execution plan is "almost" the perfect query. I mean, on each table I can see a "Filter Cost 0%", except for SalesXFactory2002, where there's a table scan with a cost of 6%. In my opinion it should apply a filter like on the other tables, based on the check constraint on the column Origin. Maybe there's something I don't understand properly (I'm just starting to use the Execution Plan analysis, which is sometimes a bit confusing for me...)
Any help would be appreciated. Thanks!
Indexing on a character or bit value that allows only two values will
normally not add any performance benefit because the column selectivity
is too low. You might be better off using the date as the partitioning
column without the Origin. You could also try adding a hint to force SQL
Server to use the clustered index.
I would also change the check constraints to avoid any possibility of a
time portion triggering an error:
Check Constraint (Origin=1 And BilledDate >= '20000101' And BilledDate <
'20010101')
David Gugick
Quest Software
www.imceda.com
www.quest.com|||"Christian Hamel" <chamel@.NoSpAmMeRaLlOwEd.CoM> wrote in message
news:O%23AEf7exFHA.3720@.TK2MSFTNGP11.phx.gbl...
Partitioned views only support a single partitioning column.
So the check constraints should be just on the date. And in the query plan you shouldn't even see the other partitions. What you're currently seeing is not true partition elimination. The query is just hitting each PK index and quickly eliminating the partitions which don't contain that value.
Partitioned views allow the data in a large table to be split into smaller
member tables. The data is partitioned between the member tables based on
ranges of data values in one of the columns. The data ranges for each member
table are defined in a CHECK constraint specified on the partitioning
column.
http://msdn2.microsoft.com/ms248875
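To make that concrete, a minimal sketch of what date-only member tables and the view might look like (all object and column names are illustrative, not the poster's actual schema):
-- One member table per year; the CHECK constraint and the PRIMARY KEY
-- use only BilledDate as the partitioning column.
CREATE TABLE dbo.Sales2002
(
    SaleID int NOT NULL,
    Origin tinyint NOT NULL,
    BilledDate datetime NOT NULL
        CHECK (BilledDate >= '20020101' AND BilledDate < '20030101'),
    Amount money NOT NULL,
    CONSTRAINT PK_Sales2002 PRIMARY KEY (BilledDate, SaleID)
);

-- ...one table per year, then:
CREATE VIEW dbo.viewSalesPartitioned
AS
SELECT SaleID, Origin, BilledDate, Amount FROM dbo.Sales2001
UNION ALL
SELECT SaleID, Origin, BilledDate, Amount FROM dbo.Sales2002;  -- plus the remaining years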
If you add an index on Origin, or perhaps add Origin as well as BilledDate
to your primary key, you should be able to eliminate the table scan. You
will get partition elimination to avoid hitting the other time periods, but
normal index s

David|||My TinyInt column is not part of the index.
I thought that since there was a check constraint on the table that allows only a specific value, SQL Server would be kind enough not to scan the records, based on the constraint and the value specified in my query.
Thanks for the reply.
"David Gugick" <david.gugick-nospam@.quest.com> a crit dans le message de
news: %23NC0dEfxFHA.3236@.TK2MSFTNGP14.phx.gbl...
|||Ok, I see then. I'll check what I can do.
Thank you very much. I'll go to bed more intelligent tonight :)
"David Browne" <davidbaxterbrowne no potted meat@.hotmail.com> a crit dans
le message de news: uzk3sGfxFHA.1032@.TK2MSFTNGP12.phx.gbl...
> "Christian Hamel" <chamel@.NoSpAmMeRaLlOwEd.CoM> wrote in message
> news:O%23AEf7exFHA.3720@.TK2MSFTNGP11.phx.gbl...
> Partitioned views only support a single partitioning column.
> So the check constraints should be just on the date. And in the query
> plan you shouldn't even see the other partitions. What you're currently
> seeing is not true partition elimination. The query is just hitting each
> PK index and quickly eliminating the partitions which don't contain the
> that column.
> Partitioned views allow the data in a large table to be split into smaller
> member tables. The data is partitioned between the member tables based on
> ranges of data values in one of the columns. The data ranges for each
> member table are defined in a CHECK constraint specified on the
> partitioning column.
> http://msdn2.microsoft.com/ms248875
> If you add an index on Origin, or perhaps add Origin as well as BilledDate
> to your primary key, you should be able to eliminate the table scan. You
> will get partition elimination to aviod hitting the other time period, but
> normal index s

> David
>
>
Monday, February 20, 2012
parameterized query that counts the number of null values in an int32 column
I am trying to write a parameterized query that counts the number of NULL values in an int32 column. Here are the variations I have tried:
cmd.Parameters.Add(new SqlCeParameter("@IntColumn", SqlInt32.Null));
cmd.CommandText = "select count(*) from TableName where IntColumn is not @IntColumn";
cmd.Parameters.Add(new SqlCeParameter("@IntColumn", DBNull.Value));
cmd.CommandText = "select count(*) from TableName where IntColumn is not @IntColumn";
cmd.Parameters.Add(new SqlCeParameter("@IntColumn", SqlInt32.Null));
cmd.CommandText = "select count(*) from TableName where not IntColumn = @IntColumn";
cmd.Parameters.Add(new SqlCeParameter("@IntColumn", DBNull.Value));
cmd.CommandText = "select count(*) from TableName where not IntColumn = @IntColumn";
cmd.Parameters.Add(new SqlCeParameter("@IntColumn", SqlDbType.Int));
cmd.Parameters["@IntColumn"].Value = SqlInt32.Null;
cmd.CommandText = "select count(*) from TableName where IntColumn is not @IntColumn";
cmd.Parameters.Add(new SqlCeParameter("@IntColumn", SqlDbType.Int));
cmd.Parameters["@IntColumn"].Value = SqlInt32.Null;
cmd.CommandText = "select count(*) from TableName where not IntColumn = @IntColumn";
cmd.Parameters.Add(new SqlCeParameter("@IntColumn", SqlDbType.Int));
cmd.Parameters["@IntColumn"].Value = DBNull.Value;
cmd.CommandText = "select count(*) from TableName where IntColumn is not @IntColumn";
cmd.Parameters.Add(new SqlCeParameter("@IntColumn", SqlDbType.Int));
cmd.Parameters["@IntColumn"].Value = DBNull.Value;
cmd.CommandText = "select count(*) from TableName where not IntColumn = @IntColumn";
cmd.Parameters.Add(new SqlCeParameter("@IntColumn", SqlDbType.Variant));
cmd.Parameters["@IntColumn"].Value = SqlInt32.Null;
cmd.CommandText = "select count(*) from Meter where IntColumn is not @IntColumn";
cmd.Parameters.Add(new SqlCeParameter("@IntColumn", SqlDbType.Variant));
cmd.Parameters["@IntColumn"].Value = SqlInt32.Null;
cmd.CommandText = "select count(*) from Meter where not IntColumn = @IntColumn";
cmd.Parameters.Add(new SqlCeParameter("@IntColumn", SqlDbType.Variant));
cmd.Parameters["@IntColumn"].Value = DBNull.Value;
cmd.CommandText = "select count(*) from Meter where IntColumn is not @IntColumn";
cmd.Parameters.Add(new SqlCeParameter("@IntColumn", SqlDbType.Variant));
cmd.Parameters["@IntColumn"].Value = DBNull.Value;
cmd.CommandText = "select count(*) from Meter where not IntColumn = @IntColumn";
Whenever I use "is not" in the query I get an exception; otherwise it returns a count of 0, which is not accurate. Any help will be appreciated.
try using this syntax:
WHERE [columnName] != DBNull.Value or WHERE [columnName] <> DBNull.Value
Darren
|||I could not get that to work. Could you provide a more detailed example of what the C# code would actually look like?|||
I think you've found yourself a bug in the ADO provider for SQL Mobile. I tried this code on the Northwind database and it reports 0 rows (should be 22):
SqlCeConnection cn = new SqlCeConnection(@"Data Source = \program files\deviceapplication7\Northwind.sdf");
try
{
    cn.Open();
    SqlCeCommand cmd = new SqlCeCommand("SELECT count(*) FROM CUSTOMERS WHERE FAX = @fax", cn);
    SqlCeParameter p1 = new SqlCeParameter("@fax", DBNull.Value);
    cmd.Parameters.Add(p1);
    object o = cmd.ExecuteScalar();
    if (o != null && o != DBNull.Value)
    {
        MessageBox.Show("There were " + ((int)o).ToString() + " null fax numbers");
    }
    else
    {
        MessageBox.Show("Result of query was null");
    }
}
finally
{
    cn.Close();
}
This only occurs with a parameterized query however. If you use the normal syntax (IS NULL) in your query, you'll get the right results.
Darren
|||
Thanks for your response, Darren. I started a new project and ran a program with very similar code to what you tried and I got the same result. I was then wondering if the limitation/bug might be related to using SqlCeCommand.ExecuteScalar with parameterized queries in general, rather than specifically looking for null or not null. So I tried this program and it returned a count of 1, which is correct:
SqlCeConnection conn = new SqlCeConnection(@"Data Source= \Program Files\BugTesting\Northwind.sdf;");
conn.Open();
SqlCeCommand cmd = conn.CreateCommand();
cmd.Parameters.Add(new SqlCeParameter("@fax", SqlDbType.NVarChar, 24));
cmd.CommandText = "select COUNT(*) from Customers where Fax = @fax";
cmd.Prepare();
cmd.Parameters["@fax"].Value = "030-0076545";
object o = cmd.ExecuteScalar();
if (o != null && o != DBNull.Value)
{
    MessageBox.Show("There were " + ((int)o).ToString() + " null fax numbers");
}
else
{
    MessageBox.Show("Result of query was null");
}
conn.Close();
conn.Dispose();
This shows that you can do parameterized queries with count and execute scalar, but I still have not found a way to count the number of null values. I don't really know how parameterized queries work under the hood, but I would guess that the code that Darren posted would translate into something like this:
SELECT count(*) FROM CUSTOMERS WHERE FAX = null
If this is true, that would explain why it returns a count of 0, which is why in my original post I was trying things more along the lines of: cmd.CommandText = "select count(*) from Customers where Fax is @fax"
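For what it's worth, since NULL never compares equal (or unequal) to a parameter value, the count of NULLs is easiest to get with the non-parameterized form Darren mentioned; using the same Northwind Customers table as in the examples above:
-- NULL can only be tested with IS NULL / IS NOT NULL, so no parameter is needed here.
SELECT COUNT(*) FROM Customers WHERE Fax IS NULL;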
but that generates an SqlCeException. Is this a limitation, a bug, or is there still maybe another way to do this? Anyway the only reason why I was trying to do this as a parameterized query is because in my code runs it multiple times in a row, and I thought it might run faster this way. Even if I could get this to work somehow do think that it would improve the execution speed of the query? Also where is the best place to report this as a potential bug?