《大数据技术》实验教学大纲

课程代码

045102751

课程名称

大数据技术

英文名称

Big Data Technology

课程类别

专业领域课

课程性质

选修

学时

总学时:40;上机学时:12;实验学时:0;实践学时:0

学分

2.5

开课学期

第七学期

开课单位

计算机科学与工程学院

适用专业

计算机科学技术、网络工程、信息安全

授课语言

中文授课

先修课程

计算机网络,操作系统,程序设计,数据库

毕业要求(专业培养能力)

本课程对学生达到如下毕业要求有如下贡献:

1.工程知识:掌握扎实的基础知识、专业基本原理、方法和手段,能够将应用数学、自然科学、本专业基础知识和专业知识用于解决大数据的管理和分析计算问题,为大数据技术应用和相关工程实践打下基础。

2.问题分析:能够应用数学、自然科学和工程科学的基本原理,识别、表达、并通过文献研究分析大数据应用工程中的复杂问题,以获得有效结论。

3.设计/开发解决方案:能够设计针对大数据应用工程复杂问题的解决方案,包括满足特定需求的大数据系统设计、关键技术选择、应用工程实施流程或方案设计,并能够在设计环节中体现创新意识,考虑社会、健康、安全、法律、文化以及环境等因素。

4.研究:能够基于科学原理并采用科学方法对大数据应用工程复杂问题进行研究,包括设计实验、分析与解释数据、并通过信息综合得到合理有效的结论。

5.使用现代工具:能够针对大数据应用工程复杂问题,开发、选择与使用恰当的技术、资源、现代工程工具和信息技术工具,包括对复杂问题的预测与模拟,并能够理解其局限性。

课程培养学生的能力(教学目标)

完成课程后,学生将具备以下能力:

1)掌握分布式计算技术、大数据的分析计算模型、存储平台、分析处理技术、编程开发技术的基本知识,培养学生发现问题、解决问题的基本能力。[1,2]

2)掌握大数据存储管理、加工处理和分析计算的基本原理和基本技术,学生具有大数据的分析管理基本能力。[1,3,4]

3)掌握常用的大数据编程和应用开发技术,并具有初步大数据应用系统设计能力,培养学生的大数据技术应用实践能力。[3,5]

课程简介

本课程主要面向有一定的计算机网络,操作系统,程序设计和数据库基础知识,并且具有一定软件开发能力的高年级学生。课程主要介绍传统分布式计算的基本原理和基本开发技术,大数据存储管理和平台架构技术,大数据计算模型和分析处理算法原理,以及大数据系统构建和应用开发技术。课程需要学生阅读大量的相关文献来获得对技术的理解,还要求学生通过完成一系列实验来掌握大数据编程实践和分析处理技术方法及工具。通过本课程的学习,希望学生能够在了解和掌握大数据管理平台和分析处理技术的基础上,学会应用大数据处理技术解决现实数据处理、分析和应用问题。课程的知识模块包括分布式计算基础知识、分布式计算编程技术、大数据存储平台技术、大数据的计算模型、大数据分析处理技术、大数据编程开发技术、大数据应用开发技术七个方面。

主要仪器设备与软件

设备:PC服务器

软件:Java开发环境软件、Hadoop生态软件等

实验报告

要求给出实验的方法、步骤、过程和结论。

考核方式

实验报告:50

实验操作:50

教材、实验指导书及教学参考书目

建议教材:林伟伟,刘波编著《分布式计算、云计算与大数据》,机械工业出版社,2017年,第二版次。

主要参考资料:

[1]杨正洪著,《大数据技术入门》,清华大学出版社,2016

[2]林子雨编著,《大数据技术原理与应用(第2版)》,人民邮电出版社出版,2017.

[3]张良均等著,《Hadoop大数据分析与挖掘实战》,机械工业出版社,2015

[4]M.L.Liu著,《分布式计算原理和应用》,清华大学出版社,2004

[5]孙宇熙著,《云计算与大数据》,人民邮电出版社,2017

[6]刘鹏著,《大数据》,电子工业出版社,2017

制定人及发布时间

林伟伟,2017年7月6日


《大数据技术》实验教学内容与学时分配

实验项目编号:1
实验项目名称:分布式计算程序设计
实验学时:4
实验内容提要:基于Socket API或Java RMI编写客户/服务器通信程序,通过客户端程序对服务器程序的调用,实现简单信息查询功能(如对服务器的文件信息查询)。
实验类型:设计性
实验要求:必做
每组人数:1
主要仪器设备与软件:PC机、Java开发环境

实验项目编号:2
实验项目名称:大数据基本操作
实验学时:4
实验内容提要:掌握分布式文件系统HDFS的文件基本操作,熟悉MapReduce程序运行方法,掌握HBase数据库基本操作和Hive数据仓库基础使用,并能设计简单的大数据存储程序(如HDFS或HBase数据存储与读取程序)。
实验类型:演示性
实验要求:必做
每组人数:1-2
主要仪器设备与软件:PC服务器、Hadoop生态软件

实验项目编号:3
实验项目名称:日志大数据分析计算
实验学时:4
实验内容提要:使用MapReduce或Hive工具分析日志大数据(如手机用户上网日志数据),实现日志的基本查询和统计功能(如通过统计用户上网日志数据的TOP URL功能,实现用户上网偏好分析)。
实验类型:综合性
实验要求:必做
每组人数:1-2
主要仪器设备与软件:PC服务器、Hadoop生态软件

























“Big Data Technology” Experiment Syllabus

Course Code

045102715

Course Title

Big Data Technology

Course Category

Specialty-related Course

Course Nature

Elective Course

Class Hours

Total: 40; computer lab: 12; experiments: 0; field practice: 0

Credits

2.5

Semester

Seventh term

Institute

School of Computer Science and Engineering

Program Oriented

Computer Science and Technology, Network Engineering, Information Security

Teaching Language

Chinese

Prerequisites

“Computer Networks”, “Operating Systems”, “Programming”, “Database Systems”

Student Outcomes (Professional Training Abilities)

This course contributes to the students’ abilities in the following respects:

1. Engineering knowledge: students will learn the fundamental knowledge, basic professional principles, methodologies and techniques. Students will be trained to solve problems in big data management and processing by applying mathematics and their professional knowledge of computer science. The course enhances students’ ability to develop big data applications.

2. Problem analysis: students will learn to identify, express and analyze complex problems in big data engineering by surveying the literature and applying mathematics, engineering techniques and their professional knowledge of computer science.

3. Problem solving: students will learn how to find comprehensive solutions to problems in big data engineering, including the design of big data systems, the selection of critical techniques, and the implementation of workflows and planning. Students are encouraged to develop innovative awareness by considering multiple factors (e.g., society, environment and security) in their designs.

4. Research ability: students will learn to do research on problems in big data engineering by adopting scientific methodologies, including designing experiments, analyzing data and drawing conclusions.

5. Utilizing modern techniques: students will learn to select, utilize and develop the tools and techniques available to predict and simulate problems in big data engineering.

Teaching Objectives

After finishing the course:

(1) Students should master the basic knowledge of distributed computing techniques, big data processing models, storage platforms and programming techniques, and be trained in discovering and resolving problems. [I, II]

(2) Students should master the basic methods and techniques for storing, processing and analyzing big data. [II, III, IV]

(3) Students should master widely used big data programming techniques and be trained in designing and programming simple big data systems. [III, V]

Course Description

This course is intended for upper-year students who have a good grasp of the basics of computer networks, operating systems, programming and databases, and who have some software development ability. It introduces the basic principles and development techniques of traditional distributed computing, big data storage management and platform architecture, big data computing models and the principles of analysis and processing algorithms, and the construction of big data systems and application development. Students are required to read a substantial amount of relevant literature to build an understanding of the technology, and to complete a series of experiments to master big data programming practice and the methods and tools for analysis and processing. On the basis of understanding big data management platforms and analysis and processing techniques, students are expected to learn to apply big data processing technology to real-world data processing, analysis and application problems. The knowledge modules of the course cover seven areas: distributed computing fundamentals, distributed computing programming, big data storage platforms, big data computing models, big data analysis and processing, big data programming, and big data application development.

Instruments, Equipment and Software

Equipment: PC server

Software: Java Development Kit, Hadoop development environment

Experiment Report

The report must describe the method, steps, process and conclusions of the experiment.

Assessment

Experiment Report: 50%

Experimental Operation: 50%

Teaching Materials and Reference Books

Suggested Textbooks:

林伟伟,刘波编著《分布式计算、云计算与大数据》,机械工业出版社,2017年,第二版次。

Main References:

[1]杨正洪著,《大数据技术入门》,清华大学出版社,2016

[2]林子雨编著,《大数据技术原理与应用(第2版)》,人民邮电出版社出版,2017.

[3]张良均等著,《Hadoop大数据分析与挖掘实战》,机械工业出版社,2015

[4]M.L. Liu著,《分布式计算原理和应用》,清华大学出版社,2004

[5]孙宇熙著,《云计算与大数据》,人民邮电出版社,2017

[6]刘鹏著,《大数据》,电子工业出版社,2017

Prepared by and Date

Lin Weiwei, 6 July 2017.

“Big Data Technology” Experimental Teaching Arrangements

No.: 1
Experiment Item: Distributed Computing Program Design
Class Hours: 4
Content Summary: Write a client/server communication program with the Socket API or Java RMI in which the client calls the server to provide a simple information query function (e.g. querying information about files on the server).
Category: Design
Requirements: Compulsory
Number of Students per Group: 1
Instruments, Equipment and Software: PC, Java development environment

No.: 2
Experiment Item: Basic Operation of Big Data
Class Hours: 4
Content Summary: Master basic file operations on the distributed file system HDFS, become familiar with how MapReduce programs are run, master basic operations on the HBase database and basic use of the Hive data warehouse, and be able to design a simple big data storage program (e.g. a program that stores data to and reads data from HDFS or HBase).
Category: Demonstration
Requirements: Compulsory
Number of Students per Group: 1-2
Instruments, Equipment and Software: PC server, Hadoop development environment

No.: 3
Experiment Item: Analysis and Computing of Massive Log Data
Class Hours: 4
Content Summary: Use MapReduce or Hive to analyze massive log data (e.g. mobile users' Internet access logs) and implement basic log query and statistics functions (e.g. analyzing users' browsing preferences by computing the top URLs in their access logs).
Category: Comprehensive
Requirements: Compulsory
Number of Students per Group: 1-2
Instruments, Equipment and Software: PC server, Hadoop development environment
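
As an illustration of the kind of computation Experiment 3 asks for, the following is a minimal sketch of a top-URL count written in plain Java, without Hadoop or Hive. It only mirrors the shape of a MapReduce job: a "map" step that extracts the URL from each log line, and a "reduce" step that counts occurrences per URL. The class name `TopUrlSketch`, the method names, and the whitespace-separated log format `userId timestamp url` are assumptions made for this example; they do not come from the syllabus or the textbook.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class TopUrlSketch {

    // "Map" step: pull the URL out of one log line.
    // Assumed (hypothetical) format: "userId timestamp url", whitespace-separated.
    static Optional<String> extractUrl(String line) {
        String[] fields = line.trim().split("\\s+");
        return fields.length >= 3 ? Optional.of(fields[2]) : Optional.empty();
    }

    // "Reduce" step: count visits per URL, then keep the n most visited.
    // Ties are broken alphabetically so the result is deterministic.
    static List<Map.Entry<String, Long>> topUrls(List<String> lines, int n) {
        Map<String, Long> counts = lines.stream()
                .map(TopUrlSketch::extractUrl)
                .filter(Optional::isPresent)   // skip malformed lines
                .map(Optional::get)
                .collect(Collectors.groupingBy(url -> url, Collectors.counting()));

        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<String> sample = Arrays.asList(
                "u1 1000 example.com/news",
                "u2 1001 example.com/video",
                "u1 1002 example.com/news",
                "malformed-line");
        for (Map.Entry<String, Long> e : topUrls(sample, 2)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In an actual MapReduce job the two steps would run as separate mapper and reducer tasks over HDFS input splits, and in Hive the same result would typically be expressed as a grouped, ordered query; the sketch above only shows the counting logic on an in-memory list.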
















