@shrijeet
Created March 12, 2012 23:37
hive merge file error description
Hive Version: Hive 0.8 (last commit SHA b581a6192b8d4c544092679d05f45b2e50d42b45 )
Hadoop version: cdh3u0
I am trying to use the Hive merge-small-files feature by setting all the necessary parameters.
I am disabling the use of CombineHiveInputFormat since my input is compressed text.
hive> set mapred.min.split.size.per.node=1000000000;
hive> set mapred.min.split.size.per.rack=1000000000;
hive> set mapred.max.split.size=1000000000;
hive> set hive.merge.size.per.task=1000000000;
hive> set hive.merge.smallfiles.avgsize=1000000000;
hive> set hive.merge.size.smallfiles.avgsize=1000000000;
hive> set hive.merge.mapfiles=true;
hive> set hive.merge.mapredfiles=true;
hive> set hive.mergejob.maponly=false;
The plan launches two MR jobs, but after the first job succeeds I get a runtime error:
"java.lang.RuntimeException: Plan invalid, Reason: Reducers == 0 but reduce operator specified"
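The error text suggests a consistency check between the number of reduce tasks configured for a job and whether its plan contains a reduce operator. Below is a minimal stand-alone sketch of such a check. It is not Hive's source: the class name PlanCheckSketch, the method isInvalid, its parameters, and the first message string are hypothetical and used only for illustration. Only the second message string is taken from the error above.

```java
// Hypothetical stand-alone model of a plan sanity check; NOT Hive's actual code.
final class PlanCheckSketch {

    /**
     * Returns a reason string if the plan is inconsistent, or null if it looks valid.
     * numReduceTasks:    number of reducers configured for the job (assumed input)
     * hasReduceOperator: whether the plan contains a reduce-side operator (assumed input)
     */
    static String isInvalid(int numReduceTasks, boolean hasReduceOperator) {
        if (numReduceTasks >= 1 && !hasReduceOperator) {
            // Illustrative message for the opposite mismatch (an assumption).
            return "Reducers > 0 but no reduce operator";
        }
        if (numReduceTasks == 0 && hasReduceOperator) {
            // The message seen in the error reported above.
            return "Reducers == 0 but reduce operator specified";
        }
        return null;
    }

    public static void main(String[] args) {
        // Models the failing case: a reduce operator in the plan, zero reducers.
        String reason = isInvalid(0, true);
        if (reason != null) {
            System.out.println("Plan invalid, Reason: " + reason);
        }
    }
}
```

Under this model, the failure would occur when the merge job's plan carries a reduce operator (hive.mergejob.maponly=false) while the job is submitted with zero reducers.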
I think the problem can be fixed with this patch I came up with: https://gist.github.com/2025303
Of course, my understanding, and hence this patch, could be totally wrong. Please provide feedback.