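1. Necessity

Hadoop provides a number of configuration parameters that let admins and users tune memory flexibly. Some parameters have default values; others are cluster-specific and exist to support memory-intensive jobs. When building a cluster, the admin can first set appropriate default values. The remaining parameters can then be determined from the cluster's hardware configuration (such as the total physical and virtual memory available to tasks, the number of slots configured on each slave, and the requirements of other processes running on the slave) and from the job type (such as memory-intensive jobs).

2. Memory monitoring

(1) Purpose of monitoring task memory
Prevent a MapReduce task from consuming memory beyond a limit and thereby adversely affecting other processes running on the same slave, including other tasks and daemons such as the DataNode or TaskTracker.

(2) Virtual memory and physical memory
Hadoop can monitor both virtual memory and physical memory usage on a node, and the two are monitored independently of each other. In streaming applications, however, virtual memory usage tends to be high because the program has to load libraries in order to run the task. In such cases, monitoring physical memory gives a more accurate picture.

(3) Hadoop lets a job specify the maximum amount of memory it expects to need. Using resource aware scheduling and monitoring, Hadoop tries to ensure that only as many tasks are running as can satisfy two limits:
(a) an individual job's memory requirement
(b) the total amount of memory available for all MapReduce tasks

(4) TaskTracker monitoring of tasks
(a) Periodic monitoring
Step 1: check that the cumulative virtual memory and physical memory used by each task and its child processes do not exceed the specified amounts. Virtual memory is checked first, then physical memory. If a limit is exceeded, the task and its child processes are killed and the task is marked as failed.
Step 2: check the cumulative virtual memory and physical memory used by all running tasks on the slave and their child processes. If a limit is exceeded, kill enough tasks to bring the cumulative usage below the limit. (If the virtual memory limit is exceeded, the tasks that have made the least progress are killed; if the physical memory limit is exceeded, the tasks using the most physical memory are killed.) Tasks killed this way are marked as killed.

(5) Resource aware scheduling
Resource aware scheduling ensures that a task is scheduled onto a slave only after confirming that the slave can meet the task's memory requirement. The Capacity Scheduler takes virtual memory requirements into account when scheduling jobs.

(7) Cluster-level memory configuration
These options are related to the JobTracker and TaskTracker, and no job can modify them. In addition, the values are the same on every slave.

mapreduce.cluster.{map|reduce}memory.mb: These options define the default amount of virtual memory that should be allocated for MapReduce tasks running in the cluster. They typically match the default values set for the options mapreduce.{map|reduce}.memory.mb.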
They help in the calculation of the total amount of virtual memory available for MapReduce tasks on a slave, using the following equation:

Total virtual memory for all MapReduce tasks = (mapreduce.cluster.mapmemory.mb * mapreduce.tasktracker.map.tasks.maximum) + (mapreduce.cluster.reducememory.mb * mapreduce.tasktracker.reduce.tasks.maximum)

Typically, reduce tasks require more memory than map tasks. Hence a higher value is recommended for mapreduce.cluster.reducememory.mb. The value is specified in MB. To set a value of 2GB for reduce tasks, set mapreduce.cluster.reducememory.mb to 2048.

mapreduce.jobtracker.max{map|reduce}memory.mb: These options define the maximum amount of virtual memory that can be requested by jobs using the parameters mapreduce.{map|reduce}.memory.mb. The system will reject any submitted job that requests more memory than these limits. Typically, the values for these options should be set to satisfy the following constraint:

mapreduce.jobtracker.maxmapmemory.mb = mapreduce.cluster.mapmemory.mb * mapreduce.tasktracker.map.tasks.maximum
mapreduce.jobtracker.maxreducememory.mb = mapreduce.cluster.reducememory.mb * mapreduce.tasktracker.reduce.tasks.maximum

The value is specified in MB. If mapreduce.cluster.reducememory.mb is set to 2GB and there are 2 reduce slots configured in the slaves, the value for mapreduce.jobtracker.maxreducememory.mb should be set to 4096.

mapreduce.tasktracker.reserved.physicalmemory.mb: This option defines the amount of physical memory that is marked for system and daemon processes. Using this, the amount of physical memory available for MapReduce tasks is calculated using the following equation:

Total physical memory for all MapReduce tasks = Total physical memory available on the system - mapreduce.tasktracker.reserved.physicalmemory.mb

The value is specified in MB.
To set this value to 2GB, specify the value as 2048.

mapreduce.tasktracker.taskmemorymanager.monitoringinterval: This option defines the time the TaskTracker waits between two cycles of memory monitoring. The value is specified in milliseconds.

Note: The virtual memory monitoring function is only enabled if the variables mapreduce.cluster.{map|reduce}memory.mb and mapreduce.jobtracker.max{map|reduce}memory.mb are set to values greater than zero. Likewise, the physical memory monitoring function is only enabled if the variable mapreduce.tasktracker.reserved.physicalmemory.mb is set to a value greater than zero.

Reposted from http://blog.csdn.net/amaowolf/article/details/7188504
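
The arithmetic behind the cluster-level options can be checked with a short calculation before the values go into the configuration files. The Python sketch below is only an illustration and is not part of Hadoop: the class SlaveMemoryConfig, its field names, and the three helper functions are hypothetical. The sample values of 1024 MB per map task, 4 map slots, and 16384 MB of installed RAM are assumptions; the 2048 MB reduce setting, the 2 reduce slots, and the 2048 MB reservation follow the examples in the text. It also assumes the per-job maximum is set equal to per-task memory times slot count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaveMemoryConfig:
    """Hypothetical holder for per-slave settings; all sizes are in MB."""
    cluster_map_memory_mb: int        # mapreduce.cluster.mapmemory.mb
    cluster_reduce_memory_mb: int     # mapreduce.cluster.reducememory.mb
    map_slots: int                    # mapreduce.tasktracker.map.tasks.maximum
    reduce_slots: int                 # mapreduce.tasktracker.reduce.tasks.maximum
    reserved_physical_memory_mb: int  # mapreduce.tasktracker.reserved.physicalmemory.mb
    system_physical_memory_mb: int    # assumed input: RAM installed on the slave


def total_virtual_memory_mb(cfg: SlaveMemoryConfig) -> int:
    """Total virtual memory available to all MapReduce tasks on one slave."""
    return (cfg.cluster_map_memory_mb * cfg.map_slots
            + cfg.cluster_reduce_memory_mb * cfg.reduce_slots)


def total_physical_memory_mb(cfg: SlaveMemoryConfig) -> int:
    """Physical memory left for MapReduce tasks after the system reservation."""
    return cfg.system_physical_memory_mb - cfg.reserved_physical_memory_mb


def max_requestable_memory_mb(cfg: SlaveMemoryConfig) -> tuple[int, int]:
    """Candidate values for mapreduce.jobtracker.max{map|reduce}memory.mb,
    returned as a (map, reduce) pair."""
    return (cfg.cluster_map_memory_mb * cfg.map_slots,
            cfg.cluster_reduce_memory_mb * cfg.reduce_slots)


if __name__ == "__main__":
    cfg = SlaveMemoryConfig(
        cluster_map_memory_mb=1024,        # assumed sample value
        cluster_reduce_memory_mb=2048,     # 2GB, as in the text
        map_slots=4,                       # assumed sample value
        reduce_slots=2,                    # as in the text
        reserved_physical_memory_mb=2048,  # 2GB, as in the text
        system_physical_memory_mb=16384,   # assumed sample value
    )
    print("total virtual MB:", total_virtual_memory_mb(cfg))
    print("total physical MB:", total_physical_memory_mb(cfg))
    print("max requestable (map, reduce) MB:", max_requestable_memory_mb(cfg))
```

With these sample values the reduce-side maximum comes out to 4096, matching the worked example for mapreduce.jobtracker.maxreducememory.mb. Keeping the calculation in one place makes it easier to recompute the limits when the slot counts change.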