Pranay Rana: June 2012

Sunday, June 17, 2012

C# State machine - Yield

The yield keyword was introduced in C# 2.0. It lets the compiler build a state machine for you, so that a method can return the objects of a collection one by one as the caller iterates.

yield is a contextual keyword used in iterator methods in C#. It is used in an iterator block like this:
public IEnumerable methodname(params)
{
      foreach(type element in listofElement)
      {
         ...code for processing 
         yield return result;
      }
}
Note: here IEnumerable can be replaced by IEnumerable<T>.
What does the yield keyword do? - "When you process a collection with this keyword in an iterator block, it pauses execution and returns the current element of the collection. When the caller asks for the next element, execution resumes where it left off and moves on to the next element, which becomes the current element for that call. This continues until the last element of the collection is reached."
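The pause-and-resume behaviour is easier to see with a trace. The following is only a minimal sketch: the Numbers iterator and the log list are names I made up for illustration. It records when each part of the iterator body actually runs.

```csharp
using System;
using System.Collections.Generic;

public static class YieldDemo
{
    // Hypothetical iterator: writes to a log so we can see when its body runs.
    public static IEnumerable<int> Numbers(List<string> log)
    {
        log.Add("start");
        for (int i = 1; i <= 3; i++)
        {
            log.Add("yielding " + i);
            yield return i;   // execution pauses here until the caller asks for the next element
        }
        log.Add("end");
    }

    public static void Main()
    {
        var log = new List<string>();

        // Calling the method does not run its body yet.
        IEnumerable<int> numbers = Numbers(log);
        Console.WriteLine("Entries logged before iterating: " + log.Count);

        foreach (int n in numbers)
        {
            log.Add("consumed " + n);
        }

        // The log shows the iterator and the caller taking turns.
        foreach (string entry in log)
        {
            Console.WriteLine(entry);
        }
    }
}
```

The log is empty until the foreach loop starts, and after that the "yielding" and "consumed" entries alternate, which is the state machine at work.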

Now I am going to show how you can gain some performance by using the yield keyword.
In this example I check each DataRow of a DataTable to see whether it is empty or not.

Code With Yield Keyword
static void Main(string[] args)
{
     int[] arr = new int[] { 1, 2, 3 };

     DataTable table = new DataTable();
     table.Columns.Add("ItemName", typeof(string));
     table.Columns.Add("Quantity", typeof(int));
     table.Columns.Add("Price", typeof(float));
     table.Columns.Add("Process", typeof(string));
     //
     // Here we add six DataRows.
     //
     table.Rows.Add("Indocin", 2, 23);
     table.Rows.Add("Enebrel", 1, 10);
     table.Rows.Add(null, null, null);
     table.Rows.Add("Hydralazine", 1, null);
     table.Rows.Add("Combivent", 3, 5);
     table.Rows.Add("Dilantin", 1, 6);

     foreach (DataRow dr in GetRowToProcess(table.Rows))
     {
         if (dr != null)
         {                    
            dr["Process"] = "Processed";
            Console.WriteLine(dr["ItemName"].ToString() 
+ dr["Quantity"].ToString() + " : " + dr["Process"].ToString());
            //bool test = dr.ItemArray.Any(c => c == DBNull.Value);
         }
      }
      Console.ReadLine();
}
private static IEnumerable<DataRow> GetRowToProcess(DataRowCollection dataRowCollection)
{
     foreach (DataRow dr in dataRowCollection)
     {
          bool isempty = dr.ItemArray.All(x => x == null || 
(x!= null && string.IsNullOrWhiteSpace(x.ToString())));

          if (!isempty)
          {
             yield return dr;
             //dr["Process"] = "Processed";
          }
          else
          {
             yield return null;
             //dr["Process"] = " Not having data ";
          }
          //yield return dr;
     }
}
Code Without Yield Keyword
private static IList<DataRow> GetRowToProcess(DataRowCollection dataRowCollection)
{
    // Collect every non-empty row in a list before returning anything.
    List<DataRow> processedRows = new List<DataRow>();
    foreach (DataRow dr in dataRowCollection)
    {
        bool isempty = dr.ItemArray.All(x => x == null || 
                           (x!= null && string.IsNullOrWhiteSpace(x.ToString())));

        if (!isempty)
        {
            processedRows.Add(dr);
        }
    }
    return processedRows;
}

static void Main(string[] args)
{
   //code as above function to create datatable 
   IList<DataRow> drs = GetRowToProcess(table.Rows);
   foreach (DataRow dr in drs)
   {
     //code to process the rows 
   } 
}

Difference between the two versions
Code without the yield keyword
This version creates an extra list that points to the rows matching the condition, and then a second loop processes each of those rows.
The disadvantage is that the extra list takes up additional memory, and no row can be processed until the whole table has been scanned.
Code with the yield keyword
This version creates no extra list. With the help of yield, each row is checked and handed to the caller one at a time, as the foreach loop asks for it.
The advantage is that no extra list is allocated, and the caller can start processing, or stop early, without waiting for the whole table to be scanned.
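To make the difference measurable, here is a rough sketch that counts how many source items each approach inspects when the caller only wants the first match. It uses plain strings instead of DataRows to stay short; NonEmptyLazy, NonEmptyEager and the Inspected counter are hypothetical names, not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyVersusEager
{
    // Counts how many source items have been looked at.
    public static int Inspected;

    // Iterator version: filters on demand, one item per request.
    public static IEnumerable<string> NonEmptyLazy(IEnumerable<string> source)
    {
        foreach (string item in source)
        {
            Inspected++;
            if (!string.IsNullOrWhiteSpace(item))
            {
                yield return item;
            }
        }
    }

    // List version: filters everything up front and returns the finished list.
    public static IList<string> NonEmptyEager(IEnumerable<string> source)
    {
        var result = new List<string>();
        foreach (string item in source)
        {
            Inspected++;
            if (!string.IsNullOrWhiteSpace(item))
            {
                result.Add(item);
            }
        }
        return result;
    }

    public static void Main()
    {
        string[] items = { "Indocin", "", "Enebrel", null, "Combivent", "Dilantin" };

        Inspected = 0;
        string firstLazy = NonEmptyLazy(items).First();
        Console.WriteLine(firstLazy + " (yield): inspected " + Inspected + " item(s)");

        Inspected = 0;
        string firstEager = NonEmptyEager(items).First();
        Console.WriteLine(firstEager + " (list): inspected " + Inspected + " item(s)");
    }
}
```

With the iterator, First() stops after inspecting a single item, while the list version has already walked all six items before First() is even called.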

The following is an example of LINQ used with the yield keyword
void Main()
{
   // This uses a custom 'Pair' extension method, defined below.
   List<string> list1 = new List<string>()
 {
     "Pranay",
     "Rana",
     "Hemang",
     "Vyas"
 };
   IEnumerable<string>  query = list1.Select (c => c.ToUpper())
  .Pair()         // Local from this point on.
  .OrderBy (n => n.Length);
}

public static class MyExtensions
{
 public static IEnumerable<string> Pair (this IEnumerable<string> source)
 {
  string firstHalf = null;
  foreach (string element in source)
  if (firstHalf == null)
   firstHalf = element;
  else
  {
   yield return firstHalf + ", " + element;
   firstHalf = null;
  }
 }
}
There is one other statement besides yield return:
yield break
It stops returning sequence elements (this also happens automatically when control reaches the end of the iterator method body).
The iterator code uses the yield return statement to return each element in turn, and yield break to end the iteration early.
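As a small illustration of yield break, here is a sketch of an iterator that stops at the first negative number. TakeUntilNegative is a name I invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class YieldBreakDemo
{
    // Hypothetical iterator: returns elements until a negative value is seen.
    public static IEnumerable<int> TakeUntilNegative(IEnumerable<int> source)
    {
        foreach (int n in source)
        {
            if (n < 0)
            {
                yield break;   // ends the sequence; the remaining elements are never visited
            }
            yield return n;
        }
    }

    public static void Main()
    {
        int[] values = { 3, 1, 4, -1, 5 };
        foreach (int n in TakeUntilNegative(values))
        {
            Console.Write("{0} ", n);   // writes 3 1 4
        }
        Console.WriteLine();
    }
}
```

The 5 after the -1 is never returned, because yield break ends the whole sequence rather than skipping a single element.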

Constraint
The yield statement can only appear inside an iterator block, which might be used as a body of a method, operator, or accessor. The body of such methods, operators, or accessors is controlled by the following restrictions:
  • Unsafe blocks are not allowed.
  • Parameters to the method, operator, or accessor cannot be ref or out.
  • A yield statement cannot appear in an anonymous method.
  • When used with expression, a yield return statement cannot appear in a catch block or in a try block that has one or more catch clauses.

Friday, June 15, 2012

String concat with null

Source : My Highest Voted answer on StackOverflow
Question
This is valid C# code
var bob = "abc" + null + null + null + "123";  // abc123
This is not valid C# code
var wtf = null.ToString(); // compiler error
Why is the first statement valid?

Answer
The reason the first one works:
From MSDN:
In string concatenation operations, the C# compiler treats a null string the same as an empty string, but it does not convert the value of the original null string.

More information on the + binary operator:

The binary + operator performs string concatenation when one or both operands are of type string.

If an operand of string concatenation is null, an empty string is substituted. Otherwise, any non-string argument is converted to its string representation by invoking the virtual `ToString` method inherited from type object.

If ToString returns null, an empty string is substituted.

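The reason for the error in the second one:
null (C# Reference) - The null keyword is a literal that represents a null reference, one that does not refer to any object. null is the default value of reference-type variables.
The null literal on its own has no type, so the compiler has no type on which to look up ToString, and it rejects the member access at compile time.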
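A short sketch to try this out. NullConcatDemo and its Join helper are hypothetical names used only for this example. It also shows what happens with a null string variable, which does compile but fails at run time.

```csharp
using System;

public static class NullConcatDemo
{
    // Hypothetical helper: concatenates two strings, either of which may be null.
    public static string Join(string left, string right)
    {
        return left + right;
    }

    public static void Main()
    {
        string nothing = null;

        // null operands are treated as empty strings.
        Console.WriteLine("abc" + null + null + null + "123");   // abc123
        Console.WriteLine(Join(nothing, "x"));                   // x
        Console.WriteLine(Join(nothing, nothing).Length);        // 0

        // A null variable has a type (string), so this compiles,
        // but calling a method on it throws at run time.
        try
        {
            Console.WriteLine(nothing.ToString());
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("NullReferenceException");
        }
    }
}
```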


Thursday, June 14, 2012

Concat() vs Union()

Recently I worked with two methods on my enumerable objects: Union() and Concat(). Developers mostly use these methods interchangeably to combine two collections into a single collection, but they are not the same. In this post I am going to show the actual difference between the two methods.

Def. from MSDN
Enumerable.Concat  - Concatenates two sequences.
Enumerable.Union    - Produces the set union of two sequences by using the default equality comparer.

If you read the definitions carefully you can already see the difference between the two methods. To understand it better, have a look at the example below.
int[] ints1 = { 1, 2, 3 };
int[] ints2 = { 3, 4, 5 };
IEnumerable<int> union = ints1.Union(ints2);
Console.WriteLine("Union");
foreach (int num in union)
{
   Console.Write("{0} ", num);
}
Console.WriteLine();
IEnumerable<int> concat = ints1.Concat(ints2);
Console.WriteLine("Concat");
foreach (int num in concat)
{
   Console.Write("{0} ", num);
}
Output
Union
1 2 3 4 5
Concat
1 2 3 3 4 5


The output shows that the Concat() method simply joins the two enumerable collections into one. It does not process or compare any element; it just returns a single enumerable collection with all the elements of both.

The Union() method returns an enumerable collection with the duplicates eliminated, i.e. it returns an element only once even if that element exists in both of the collections on which the union is performed.

Important points to note
  • Concat() is therefore cheaper than Union(): it only streams the elements through, whereas Union() has to keep track of the elements it has already returned in order to drop duplicates.
  • However, if the collection built with Concat() contains a large number of duplicate elements, any further operation on it takes longer than on the collection built with Union(), because Union() eliminates the duplicates and produces a collection with fewer elements.
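The relationship between the two methods can be sketched as follows. ConcatUnionDemo and its Show helper are hypothetical names for this illustration. For these inputs, Union() gives the same elements as Concat() followed by Distinct(), and it also removes duplicates that occur within a single source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConcatUnionDemo
{
    // Hypothetical helper: renders a sequence as space-separated text.
    public static string Show(IEnumerable<int> values)
    {
        return string.Join(" ", values);
    }

    public static void Main()
    {
        int[] ints1 = { 1, 2, 3 };
        int[] ints2 = { 3, 4, 5 };

        Console.WriteLine("Concat:            " + Show(ints1.Concat(ints2)));            // 1 2 3 3 4 5
        Console.WriteLine("Union:             " + Show(ints1.Union(ints2)));             // 1 2 3 4 5
        Console.WriteLine("Concat + Distinct: " + Show(ints1.Concat(ints2).Distinct())); // 1 2 3 4 5

        // Duplicates inside one source are removed by Union() as well.
        int[] repeated = { 1, 1, 2 };
        Console.WriteLine("Union (repeated):  " + Show(repeated.Union(new[] { 2 })));    // 1 2
    }
}
```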

Wednesday, June 6, 2012

Format Number To Display

End users and clients often need numeric data displayed in different formats. In this post I am going to discuss the custom format specifiers that C#.net provides to meet this requirement.
I am going to go through the specifiers one by one.

"0" Custom Specifier
ToString("00000") - pads the number with leading zeros when it has fewer digits than the number of zeros in the format. The code below displays 01234, because the number has four digits and the format has five zeros.
double value;
value = 1234;
Console.WriteLine("Format 1:  " + value.ToString("00000"));
Console.WriteLine();
ToString("00.00") - does the same as above before the decimal point, and the zeros after the decimal point fix the number of decimal digits: extra digits are rounded away and missing digits are shown as zeros. The output of the following code is 01.24, i.e. only 2 digits are allowed after the decimal point. Note: the decimal point is displayed according to the specified culture.
value = 1.235;
Console.WriteLine("Format 2:  " + value.ToString("00.00", 
                CultureInfo.InvariantCulture));
Console.WriteLine();
ToString("0,0") - inserts a group separator into the number. When the following code is executed, a comma is displayed after every three digits: 1,234,567,890. Note: the comma is replaced by the group separator of the specified culture.
value = 1234567890;
Console.WriteLine("Format 3:  " + value.ToString("0,0",
                CultureInfo.InvariantCulture));
Console.WriteLine();
ToString("0,0.0") - is a combination of the formats above: group separators plus one digit after the decimal point.
value = 1234567890.123456;
Console.WriteLine("Format 4:  " + value.ToString("0,0.0", 
                CultureInfo.InvariantCulture));
Console.WriteLine();

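Output (blank lines omitted)
Format 1:  01234
Format 2:  01.24
Format 3:  1,234,567,890
Format 4:  1,234,567,890.1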

"#" Custom Specifier
This specifier works like the "0" specifier, with one basic difference: where the number has no digit for a "#" position, nothing is displayed, whereas the "0" specifier displays a 0 in that position.
ToString("#####") or ToString("#") - each # is replaced by a digit of the number. The output of the code below is 9867752985 for both formats, with no padding, which is how it differs from the "0" specifier.
value = 9867752985;
Console.WriteLine(value.ToString("#####"));
Console.WriteLine(value.ToString("#"));
Console.WriteLine();
The code below does the same as above, replacing each # with a digit and copying the other characters as they are, so the output of the code is [98-67-75].
value = 986775;
Console.WriteLine(value.ToString("[##-##-##]"));
Console.WriteLine();
The code below formats the number in the same way and displays it as (986) 775-2985.
value = 9867752985;
Console.WriteLine(value.ToString("(###) ###-####"));
Console.WriteLine();
Output (blank lines omitted)
9867752985
9867752985
[98-67-75]
(986) 775-2985


"." Custom Specifier
This specifier was already discussed with the "0" specifier. It is useful for fixing the position of the decimal point: where a digit is missing at the start or at the end, a 0 is displayed in its place. With the invariant culture the output of the code below would be "01.20", and with value = 12 it would be "12.00". Here, however, I pass the Danish culture, so the output is "01,20": the "." is replaced by ",".
value = 1.2;
Console.WriteLine(value.ToString("00.00",CultureInfo.CreateSpecificCulture("da-DK")));

"%" Custom Specifier
This format multiplies the number by 100 and displays it with a "%" sign. So if the number is 0.086, applying the % specifier displays it as "8.6%", and if the number is 86 the output is "8600%". The code below does exactly that.
value = .086;
Console.WriteLine(value.ToString("#0.##%", CultureInfo.InvariantCulture));
Console.WriteLine();

value = .869;
Console.WriteLine(value.ToString("00.##%", CultureInfo.InvariantCulture));
Console.WriteLine();

value = 86;
Console.WriteLine(value.ToString("#0.##%", CultureInfo.InvariantCulture));
Console.WriteLine();
Output (blank lines omitted)
8.6%
86.9%
8600%


"‰" Custom Specifier
The per mille character (‰ or \u2030) multiplies the number by 1000, so the code below displays 3.54 followed by the ‰ sign. I have not found much practical use for this format.
value = .00354;
string perMilleFmt = "#0.## " + '\u2030';
Console.WriteLine(value.ToString(perMilleFmt, CultureInfo.InvariantCulture));

"E" and "e" Custom Specifiers
This causes the number to be displayed in scientific notation, with an "E" or "e" symbol in front of the exponent.
value = 86000;
Console.WriteLine(value.ToString("0.###E+0", CultureInfo.InvariantCulture));
Console.WriteLine();

Console.WriteLine(value.ToString("0.###E+000", CultureInfo.InvariantCulture));
Console.WriteLine();

value = -80;
Console.WriteLine(value.ToString("0.###E-000", CultureInfo.InvariantCulture));
Console.WriteLine();
Output (blank lines omitted)
8.6E+4
8.6E+004
-8E001


";" Section Separator
This allows a number to be displayed differently depending on its sign. In the code below, the fmt variable holds the format I apply to each number: the section before the first ";" is for positive numbers, the second section is for negative numbers, and the last section is for zero.
Basically it is a "positive;negative;zero" format.
You can see what it does in the output of this code.
double posValue = 1234;
double negValue = -1234; 
double zeroValue = 0;

string fmt = "+##;-##;**Zero**";

Console.WriteLine("value is positive : " + posValue.ToString(fmt));    
Console.WriteLine();

Console.WriteLine("value is negative : " +negValue.ToString(fmt));    
Console.WriteLine();

Console.WriteLine("value is Zero : " + zeroValue.ToString(fmt));
Console.WriteLine();
Output (blank lines omitted)
value is positive : +1234
value is negative : -1234
value is Zero : **Zero**


Source :  http://msdn.microsoft.com/en-us/library/0c899ak8.aspx