TECHTalksPro
  • Home
  • Business
    • Internet
    • Market
    • Stock
  • Parent Category
    • Child Category 1
      • Sub Child Category 1
      • Sub Child Category 2
      • Sub Child Category 3
    • Child Category 2
    • Child Category 3
    • Child Category 4
  • Featured
  • Health
    • Childcare
    • Doctors
  • Home
  • SQL Server
    • SQL Server 2012
    • SQL Server 2014
    • SQL Server 2016
  • Downloads
    • PowerShell Scripts
    • Database Scripts
  • Big Data
    • Hadoop
      • Hive
      • Pig
      • HDFS
    • MPP
  • Certifications
    • Microsoft SQL Server -70-461
    • Hadoop-HDPCD
  • Problems/Solutions
  • Interview Questions

Thursday, July 13, 2017

Hive Order by Vs Sort by

 Chitchatiq     7/13/2017 04:50:00 PM     Hive     No comments   

Today we will discuss about how and where we can use Order by and Sort by clause in Hive


ORDER BY:
è Forces all the data to go into the same reducer node, by doing this, Order by ensure that entire dataset is totally ordered
è Uses a single reducer to guarantee total order in output
Drawbacks:
è Single reducer will take a long time to sort very large outputs

Sort By:
è Sort the rows based on the given columns per reducer. If there are more than one reducer, then the output per reducer will be sorted
Drawbacks:
If we have more than one reducer, then order of total output is not guaranteed to be sorted.

Let’s take one simple example. Currently Dept. table has following data


First will try to run the Order by query by setting reducer count as 2


If you see above screenshot all the data got sorted based on deptno column in Ascending order.

Now will try to run Sort by command.

We can clearly see that individual reducer level results are sorted but not at complete data set level.


However, sometimes we do not require total ordering. For example, suppose you have a table called user_action_table where each row has user_id, action, and time. Your goal is to order them by time per user_id and in this situation, we can use Sort By clause



Read More
  • Share This:  
  •  Facebook
  •  Twitter
  •  Google+
  •  Stumble
  •  Digg

Wednesday, July 12, 2017

How to retrive/get all the numbers from a string using SQL server

 Chitchatiq     7/12/2017 01:07:00 PM     Problems&Solutions, SQL Server     No comments   


Script for getting all the numbers from string

DECLARE @string1 varchar(100)='welcome2sql1239hello23';
DECLARE @string2 varchar(100)='',@i int=0;
WHILE @i<=LEN(@string1)
BEGIN
IF(SUBSTRING(@string1,@i,1)>='0' AND SUBSTRING(@string1,@i,1)<='9')
SET @string2=@string2+SUBSTRING(@string1,@i,1);
SET @i=@i+1;
END

SELECT @string2 AS [Numbers obtained];
Read More
  • Share This:  
  •  Facebook
  •  Twitter
  •  Google+
  •  Stumble
  •  Digg

Question: Find position of numbers in the given string

 Chitchatiq     7/12/2017 01:00:00 PM     Problems&Solutions     No comments   


Script for finding the position of numbers in the given string

DECLARE @input VARCHAR(100)='welcome2Sql376hi975';
DECLARE @output VARCHAR(100)=PATINDEX('%[0-9]%',@input);
DECLARE @position VARCHAR(100)=@output;
WHILE @output<LEN(@input)
BEGIN
SET @input=STUFF(@input,PARSE(@output AS BIGINT),1,'*');
SET @output=PATINDEX('%[0-9]%',@input);
SET @position=@position+','+@output;
END

SELECT @position AS [Positions];
Read More
  • Share This:  
  •  Facebook
  •  Twitter
  •  Google+
  •  Stumble
  •  Digg
Newer Posts Older Posts Home

Popular Posts

  • FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
    Error: Resolution:         1.       Check if Your table storage format is ORC         2.      Check if yo...
  • Greenplum Best Practises
    Best Practices: A distribution key should not have more than 2 columns, recommended is 1 column. While modeling a database,...
  • Greenplum System-Useful Queries
    Useful Queries: Query to verify the list of segments in a Greenplum system select * from gp_segment_configuration; R...
  • How to fix ERROR 1045 : Access denied for user 'root'@'localhost' (using password: YES)
    Please follow below steps to fix HDP Sandbox -Mysql root password issue: Enter mysql -u root UPDATE mysql.user SET Password=PASSWOR...
  • HADOOP - HDFS OPERATIONS
    Starting HDFS To format the configured HDFS file system, execute the following command in namenode HDFS server, $ hadoop namenode ...

Facebook

Categories

Best Practices (1) Big Data (5) BigData&Hadoop (6) DAG (1) Error 10294 (1) external tables (1) File Formats in Hive (1) Greenplum (3) Hadoop (5) Hadoop Commands (1) Hive (4) Internal tables (1) interview Questions (1) Managed tables (1) MySQL Installation (1) ORCFILE (1) org.apache.hadoop.hive.ql.exec.MoveTask (1) Powershell (1) Problems&Solutions (15) RCFILE (1) return code 1 (1) SEQUENCEFILE (1) Service 'userhome' (1) Service 'userhome' check failed: java.io.FileNotFoundException (1) SQL Server (27) sqoop (2) SSIS (1) TEXTFILE (1) Tez (1) transaction manager (1) Views (1) What is Hadoop (1)

Blog Archive

  • December (1)
  • November (1)
  • October (2)
  • September (6)
  • August (1)
  • July (3)
  • March (1)
  • February (8)
  • January (4)
  • December (9)
  • August (4)
  • July (1)

Popular Tags

  • Best Practices
  • Big Data
  • BigData&Hadoop
  • DAG
  • Error 10294
  • external tables
  • File Formats in Hive
  • Greenplum
  • Hadoop
  • Hadoop Commands
  • Hive
  • Internal tables
  • interview Questions
  • Managed tables
  • MySQL Installation
  • ORCFILE
  • org.apache.hadoop.hive.ql.exec.MoveTask
  • Powershell
  • Problems&Solutions
  • RCFILE
  • return code 1
  • SEQUENCEFILE
  • Service 'userhome'
  • Service 'userhome' check failed: java.io.FileNotFoundException
  • SQL Server
  • sqoop
  • SSIS
  • TEXTFILE
  • Tez
  • transaction manager
  • Views
  • What is Hadoop

Featured Post

TOP 100 SQL SERVER INTERVIEW QUESTIONS

SQL SERVER INTERVIEW QUESTIONS 1.       What is the Complex task that you handled in your project 2.       What are the diffe...

Pages

  • Home
  • SQL SERVER
  • Greenplum
  • Hadoop Tutorials
  • Contact US
  • Disclaimer
  • Privacy Policy

Popular Posts

  • FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
    Error: Resolution:         1.       Check if Your table storage format is ORC         2.      Check if yo...
  • Greenplum System-Useful Queries
    Useful Queries: Query to verify the list of segments in a Greenplum system select * from gp_segment_configuration; R...

Copyright © TECHTalksPro
Designed by Vasu