AI-肺部结节大赛进度直播

AI的比赛是第一次参加，就当作一次实战练习了。由于比赛规则，这里也只能记录一些想法，不能将一些实际的实现方法以代码或者其他形式发布出来，请谅解。由于组委会规定，所以神经网络只能使用Caffe套件。

OpenCV @2017/04/19

官方教程第一步是进行图像分割，图像分割一般使用OpenCV来做，下面进行OpenCV的研究。
对于C#比较熟悉，所以暂时考虑在Win平台上进行研究。

在Windows上的安装

http://opencv.org/releases.html
选择Win Pack，安装其实就是一个解压缩的过程。
在这个路径中保存着所需要的可执行文件
你的解压路径\opencv\build\x64\vc14\bin
这个可执行文件需要添加到系统的Path里面去，以后其他程序可以直接调用。

EnguCV

EmguCV是一个C#用的OpenCV包装
(不知道是不是一定要安装OpenCV，可能这个东西可以脱离OpenCV直接用)
Install-Package EmguCV即可将这个包导入到项目中。

using System;
using Emgu.CV;
using Emgu.CV.CvEnum;
using Emgu.CV.Structure;


namespace OpenCV
{
    class Program
    {
        static void Main(string[] args)
        {
            String win1 = "Test Window"; //The name of the window
            CvInvoke.NamedWindow(win1); //Create the window using the specific name

            Mat img = new Mat(200, 400, DepthType.Cv8U, 3); //Create a 3 channel image of 400x200
            img.SetTo(new Bgr(255, 0, 0).MCvScalar); // set it to Blue color

            //Draw "Hello, world." on the image using the specific font
            CvInvoke.PutText(
               img,
               "Hello, world",
               new System.Drawing.Point(10, 80),
               FontFace.HersheyComplex,
               1.0,
               new Bgr(0, 255, 0).MCvScalar);


            CvInvoke.Imshow(win1, img); //Show the image
            CvInvoke.WaitKey(0);  //Wait for the key pressing event
            CvInvoke.DestroyWindow(win1); //Destroy the window if key is pressed
        }
    }
}

看一下是否能够运行，出现一个窗体如下所示。代码暂时不去理解了。

Emgu的HelloWorld

SimpleITK@2017/04/27

本次比赛的CT数据格式为MHD文件，由于采样用设备不同，每个文件的大小也不同。最小的在50M，最大的在250M左右。每个MHD（1K的小文件）和ZRAW文件为一套数据。整个比赛的文件大小约为50G，计算资源的多少是胜出的关键因素。
MHD的内容如下所示，保存着CT图片的一些摘要信息

ObjectType = Image
NDims = 3
BinaryData = True
BinaryDataByteOrderMSB = False
CompressedData = True
CompressedDataSize = 88265161
TransformMatrix = 1 0 0 0 1 0 0 0 1
Offset = -211.621 -358.121 -99.4
CenterOfRotation = 0 0 0
AnatomicalOrientation = RAI
ElementSpacing = 0.757813 0.757813 1
DimSize = 512 512 264
ElementType = MET_SHORT
ElementDataFile = LKDS-00012.zraw

注释版：

ObjectType = Image 目标类型是图像
NDims = 3    三维数据
BinaryData = True    二进制数据
BinaryDataByteOrderMSB = False    不是big endian（Unix格式），是windows linux格式二进制
CompressedData = True    压缩数据
CompressedDataSize = 91327856    压缩数据大小
TransformMatrix = 1 0 0 0 1 0 0 0 1    传输矩阵100,010,001 x,y,z
Offset = -229 -131 -506.4    原点坐标
CenterOfRotation = 0 0 0    中心旋转：无
AnatomicalOrientation = RAI 患者体位
ElementSpacing = 0.894531 0.894531 1    像素间隔x,y,z
DimSize = 512 512 365    数据大小，x，y, z
ElementType = MET_SHORT    数据类型，16bit整数
ElementDataFile = LKDS-00010.zraw    数据存储的文件名

读取mhd文件可以参考以下这篇文章：
[天池医疗AI大赛[第一季]：肺部结节智能诊断] simpleITK读取mhd的demo(4行,zraw和mhd放一起)

import SimpleITK as sitk
file_name = 'E:\MHD\LKDS-00012.mhd'
img = sitk.ReadImage(file_name)
img_array = sitk.GetArrayFromImage(img)

目录最好是纯英文的，不然可能出现无法读取的问题

下面根据这篇文章尝试将结节图像提取出来（这里只是进行一个假象的固定结节的保存），每个结节的采样文件大概是3M大小。
[天池医疗AI大赛[第一季]：肺部结节智能诊断] 参赛新人向导（一）数据读取和可视化--数据长什么样（含代码）

def process_image(file_name,output_path,nodule):
    itk_img = sitk.ReadImage(file_name)
    # load the data once
    img_array = sitk.GetArrayFromImage(itk_img) # indexes are z,y,x (notice the ordering)
    num_z, height, width = img_array.shape      #heightXwidth constitute the transverse plane
    origin = np.array(itk_img.GetOrigin())      # x,y,z  Origin in world coordinates (mm)
    spacing = np.array(itk_img.GetSpacing())    # spacing of voxels in world coor. (mm)
    # go through all nodes (why just the biggest?)
    node_x = nodule.node_x
    node_y = nodule.node_y
    node_z = nodule.node_z
    diam =  nodule.diam
    # just keep 3 slices
    imgs = np.ndarray([3,height,width],dtype=np.float32)
    masks = np.ndarray([3,height,width],dtype=np.uint8)
    center = np.array([node_x, node_y, node_z])  # nodule center
    v_center = SITKlib.worldToVoxel(center,origin,spacing)  # nodule center in voxel space (still x,y,z ordering)
    for i, i_z in enumerate(np.arange(int(v_center[2])-1, int(v_center[2])+2).clip(0, num_z-1)): # clip prevents going out of bounds in Z
        mask = SITKlib.make_mask(center, diam, i_z*spacing[2]+origin[2],width, height, spacing, origin)
        masks[i] = mask
        imgs[i] = img_array[i_z]
    np.save(os.path.join(output_path,"images.npy"),imgs)
    np.save(os.path.join(output_path,"masks.npy"),masks)
    SITKlib.show_img(imgs,masks)

接下来分析一下每部分的代码

读图

    itk_img = sitk.ReadImage(file_name)
    # load the data once
    img_array = sitk.GetArrayFromImage(itk_img) # indexes are z,y,x (notice the ordering)
    num_z, height, width = img_array.shape      #height X width constitute the transverse plane

使用SimpleITK的ReadImage函数可以获得CT图像。这个图像应该不可以直接进行可视化操作。
然后将这个图像放入一个数组中保存起来，注意这里的顺序，Z，Y，X。

数据准备

origin：表示CT图像最边缘的坐标
sapcing：真实世界和像素的比例关系
然后我们假定有一个结节，其位置和半径由 node_x，node_y，node_z，diam决定

    origin = np.array(itk_img.GetOrigin())      # x,y,z  Origin in world coordinates (mm)
    spacing = np.array(itk_img.GetSpacing())    # spacing of voxels in world coor. (mm)
    # go through all nodes (why just the biggest?)
    node_x = nodule.node_x
    node_y = nodule.node_y
    node_z = nodule.node_z
    diam =  nodule.diam

真实世界坐标和CT像素坐标的转换

这里因为只保留和结节相关的3层，所以，z方向上固定为3.
然后比较核心的计算是这样的：
(center-origin)：结节的位置和图像边缘的位置做比较，就知道相差多少距离。
例如这里的Origin位置是：-211.621， -358.121， -99.4 （这个可以在mhd定义的地方看到）
结节的位置是：-100.56 ，67.26， -31.81
则相差的位置大约为 [ 111.061, 425.381, 67.59 ]
然后根据实际世界坐标和像素变换关系（ElementSpacing = 0.757813 ， 0.757813 ，1）
可以知道结节中心的位置所在的像素为：[ 147， 561， 68]。
这里请关注Z方向编号是68

    # just keep 3 slices
    imgs = np.ndarray([3,height,width],dtype=np.float32)
    masks = np.ndarray([3,height,width],dtype=np.uint8)
    center = np.array([node_x, node_y, node_z])  # nodule center
    v_center = SITKlib.worldToVoxel(center,origin,spacing)  # nodule center in voxel space (still x,y,z ordering)

def worldToVoxel(worldCoord, origin, spacing):
    '''
        结节世界坐标转成图像坐标
        worldCoord:结节世界坐标
        origin:边缘坐标
        spacing：比例尺
    '''
    voxelCoord = np.absolute(worldCoord - origin)
    voxelCoord = voxelCoord / spacing
    voxelCoord = voxelCoord.astype(np.int32)
    return voxelCoord

借用网络上的一张图说明问题：
原文：http://insightsoftwareconsortium.github.io/SimpleITK-Notebooks/03_Image_Details.html

坐标转换

保存图片

下面则将67，68，69，三个层面（每个层面是1个像素）的图像进行保存，整个层面的保存在"images.npy"里面。
结节部分的图片保存在"masks.npy"里面。

    for i, i_z in enumerate(np.arange(int(v_center[2])-1, int(v_center[2])+2).clip(0, num_z-1)): # clip prevents going out of bounds in Z
        mask = make_mask(center, diam, i_z*spacing[2]+origin[2],width, height, spacing, origin)
        masks[i] = mask
        imgs[i] = img_array[i_z]
    np.save(os.path.join(output_path,"images.npy"),imgs)
    np.save(os.path.join(output_path,"masks.npy"),masks)

可视化

彩色切片：原始图像
黑白切片：灰度图像
节点：将节点区域使用白色展示在黑色背景中，作为节点切片的Mask层
节点切片：仅将结节表示在图像中，并且进行灰度处理（为什么非节点地方不是黑色的？？）
** imgs[i]*masks[i] ，mask里面只有 1 （白色，结节）、0（黑色）两种值 **

    for i in range(len(imgs)):
        print ("图片的第 %d 层" % i)
        fig,ax = plt.subplots(2,2,figsize=[8,8])
        ax[0,0].imshow(imgs[i])
        ax[0,0].set_title(u'彩色切片')
        ax[0,1].imshow(imgs[i],cmap='gray')
        ax[0,1].set_title(u'黑白切片')
        ax[1,0].imshow(masks[i],cmap='gray')
        ax[1,0].set_title(u'节点')
        ax[1,1].imshow(imgs[i]*masks[i],cmap='gray')
        ax[1,1].set_title(u'节点切片')
        plt.show()
        print ('\n\n')

可视化

源代码（截至2017/04/28）

Solution.py

# coding: UTF-8
import os

import numpy as np
import SimpleITK as sitk
import matplotlib.pyplot as plt
import SITKlib

def process_image(file_name,output_path,nodule):
    itk_img = sitk.ReadImage(file_name)
    # load the data once
    img_array = sitk.GetArrayFromImage(itk_img) # indexes are z,y,x (notice the ordering)
    num_z, height, width = img_array.shape      #heightXwidth constitute the transverse plane
    origin = np.array(itk_img.GetOrigin())      # x,y,z  Origin in world coordinates (mm)
    spacing = np.array(itk_img.GetSpacing())    # spacing of voxels in world coor. (mm)
    # go through all nodes (why just the biggest?)
    node_x = nodule.node_x
    node_y = nodule.node_y
    node_z = nodule.node_z
    diam =  nodule.diam
    # just keep 3 slices
    imgs = np.ndarray([3,height,width],dtype=np.float32)
    masks = np.ndarray([3,height,width],dtype=np.uint8)
    center = np.array([node_x, node_y, node_z])  # nodule center
    v_center = SITKlib.worldToVoxel(center,origin,spacing)  # nodule center in voxel space (still x,y,z ordering)
    for i, i_z in enumerate(np.arange(int(v_center[2])-1, int(v_center[2])+2).clip(0, num_z-1)): # clip prevents going out of bounds in Z
        mask = SITKlib.make_mask(center, diam, i_z*spacing[2]+origin[2],width, height, spacing, origin)
        masks[i] = mask
        imgs[i] = img_array[i_z]
    np.save(os.path.join(output_path,"images.npy"),imgs)
    np.save(os.path.join(output_path,"masks.npy"),masks)
    SITKlib.show_img(imgs,masks)

def main():
    #Global Setting
    mhd_file_name = 'E:\\Huge CT Data\\train_subset00\\LKDS-00001.mhd'
    output_path = "E:\\Huge CT Data"
    nodule = SITKlib.Nodule();
    nodule.node_x = -76.4498793983
    nodule.node_y = -49.5405710363
    nodule.node_z = 229.5
    nodule.diam =  14.1804045239 
    process_image(mhd_file_name,output_path,nodule)

if __name__ == "__main__":
    main()

SITKlib.py

# coding: UTF-8
import os

import numpy as np
import SimpleITK as sitk
import matplotlib.pyplot as plt
class Nodule:
    def __init__(self):
        self.node_x = 0      
        self.node_y = 0      
        self.node_z = 0      
        self.diam = 0
def worldToVoxel(worldCoord, origin, spacing):
    '''
        结节世界坐标转成图像坐标
        worldCoord:结节世界坐标
        origin:边缘坐标
        spacing：比例尺
    '''
    voxelCoord = np.absolute(worldCoord - origin)
    voxelCoord = voxelCoord / spacing
    voxelCoord = voxelCoord.astype(np.int32)
    return voxelCoord

def show_img(imgs,masks):
    for i in range(len(imgs)):
        print ("图片的第 %d 层" % i)
        fig,ax = plt.subplots(2,2,figsize=[8,8])
        ax[0,0].imshow(imgs[i])
        ax[0,0].set_title(u'彩色切片')
        ax[0,1].imshow(imgs[i],cmap='gray')
        ax[0,1].set_title(u'黑白切片')
        ax[1,0].imshow(masks[i],cmap='gray')
        ax[1,0].set_title(u'节点')
        ax[1,1].imshow(imgs[i]*masks[i],cmap='gray')
        ax[1,1].set_title(u'节点切片')
        plt.show()
        print ('\n\n')
        #raw_input("hit enter to cont : ")

#只显示结节
def make_mask(center,diam,z,width,height,spacing,origin): 
    '''
        Center : 圆的中心 px -- list of coordinates x,y,z
        diam : 圆的直径 px -- diameter
        widthXheight : pixel dim of image
        spacing = mm/px conversion rate np array x,y,z
        origin = x,y,z mm np.array
        z = z position of slice in world coordinates mm
    '''
    mask = np.zeros([height,width])
    # 0's everywhere except nodule swapping x,y to match img
    #convert to nodule space from world coordinates

    # Defining the voxel range in which the nodule falls
    v_center = (center-origin)/spacing
    v_diam = int(diam/spacing[0]+5)
    v_xmin = np.max([0,int(v_center[0]-v_diam)-5])
    v_xmax = np.min([width-1,int(v_center[0]+v_diam)+5])
    v_ymin = np.max([0,int(v_center[1]-v_diam)-5]) 
    v_ymax = np.min([height-1,int(v_center[1]+v_diam)+5])

    v_xrange = range(v_xmin,v_xmax+1)
    v_yrange = range(v_ymin,v_ymax+1)

    # Convert back to world coordinates for distance calculation
    x_data = [x*spacing[0]+origin[0] for x in range(width)]
    y_data = [x*spacing[1]+origin[1] for x in range(height)]

    # Fill in 1 within sphere around nodule
    for v_x in v_xrange:
        for v_y in v_yrange:
            p_x = spacing[0]*v_x + origin[0]
            p_y = spacing[1]*v_y + origin[1]
            if np.linalg.norm(center-np.array([p_x,p_y,z]))<=diam:
                mask[int((p_y-origin[1])/spacing[1]),int((p_x-origin[0])/spacing[0])] = 1.0
    return(mask)

def matrix2int16(matrix):
    ''' 
        matrix must be a numpy array NXN
        Returns uint16 version
    '''
    m_min= np.min(matrix)
    m_max= np.max(matrix)
    matrix = matrix-m_min
    return(np.array(np.rint( (matrix-m_min)/float(m_max-m_min) * 65535.0),dtype=np.uint16))


#
# Helper function to get rows in data frame associated 
# with each file
def get_filename(file_list, case):
    for f in file_list:
        if case in f:
            return(f)
#
# The locations of the nodes


def normalize(image, MIN_BOUND=-1000.0, MAX_BOUND=400.0):
    """数据标准化"""
    image = (image - MIN_BOUND) / (MAX_BOUND - MIN_BOUND)
    image[image > 1] = 1.
    image[image < 0] = 0.
    return image
    #---数据标准化
    
    
def set_window_width(image, MIN_BOUND=-1000.0, MAX_BOUND=400.0):
    """设置窗宽"""
    image[image > MAX_BOUND] = MAX_BOUND
    image[image < MIN_BOUND] = MIN_BOUND
    return image
    #---设置窗宽

最后编辑于：2017.12.06 21:29:33

人面猴
序言：七十年代末，一起剥皮案震惊了整个滨河市，随后出现的几起案子，更是在滨河造成了极大的恐慌，老刑警刘岩，带你破解...
沈念sama阅读 206,126评论 6赞 481
死咒
序言：滨河连续发生了三起死亡事件，死亡现场离奇诡异，居然都是意外死亡，警方通过查阅死者的电脑和手机，发现死者居然都...
沈念sama阅读 88,254评论 2赞 382
救了他两次的神仙让他今天三更去死
文/潘晓璐我一进店门，熙熙楼的掌柜王于贵愁眉苦脸地迎上来，“玉大人，你说我怎么就摊上这事。” “怎么了？”我有些...
开封第一讲书人阅读 152,445评论 0赞 341
道士缉凶录：失踪的卖姜人
文/不坏的土叔我叫张陵，是天一观的道长。经常有香客问我，道长，这世上最难降的妖魔是什么？我笑而不...
开封第一讲书人阅读 55,185评论 1赞 278
港岛之恋（遗憾婚礼）
正文为了忘掉前任，我火速办了婚礼，结果婚礼上，老公的妹妹穿的比我还像新娘。我一直安慰自己，他们只是感情好，可当我...
茶点故事阅读 64,178评论 5赞 371
恶毒庶女顶嫁案：这布局不是一般人想出来的
文/花漫我一把揭开白布。她就那样静静地躺着，像睡着了一般。火红的嫁衣衬着肌肤如雪。梳的纹丝不乱的头发上，一...
开封第一讲书人阅读 48,970评论 1赞 284
城市分裂传说
那天，我揣着相机与录音，去河边找鬼。笑死，一个胖子当着我的面吹牛，可吹牛的内容都是我干的。我是一名探鬼主播，决...
沈念sama阅读 38,276评论 3赞 399
双鸳鸯连环套：你想象不到人心有多黑
文/苍兰香墨我猛地睁开眼，长吁一口气：“原来是场噩梦啊……” “哼！你这毒妇竟也来了？” 一声冷哼从身侧响起，我...
开封第一讲书人阅读 36,927评论 0赞 259
万荣杀人案实录
序言：老挝万荣一对情侣失踪，失踪者是张志新（化名）和其女友刘颖，没想到半个月后，有当地人在树林里发现了一具尸体，经...
沈念sama阅读 43,400评论 1赞 300
护林员之死
正文独居荒郊野岭守林人离奇死亡，尸身上长有42处带血的脓包…… 初始之章·张勋以下内容为张勋视角年9月15日...
茶点故事阅读 35,883评论 2赞 323
白月光启示录
正文我和宋清朗相恋三年，在试婚纱的时候发现自己被绿了。大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
茶点故事阅读 37,997评论 1赞 333
活死人
序言：一个原本活蹦乱跳的男人离奇死亡，死状恐怖，灵堂内的尸体忽然破棺而出，到底是诈尸还是另有隐情，我是刑警宁泽，带...
沈念sama阅读 33,646评论 4赞 322
日本核电站爆炸内幕
正文年R本政府宣布，位于F岛的核电站，受9级特大地震影响，放射性物质发生泄漏。R本人自食恶果不足惜，却给世界环境...
茶点故事阅读 39,213评论 3赞 307
男人毒药：我在死后第九天来索命
文/蒙蒙一、第九天我趴在偏房一处隐蔽的房顶上张望。院中可真热闹，春花似锦、人声如沸。这庄子的主人今日做“春日...
开封第一讲书人阅读 30,204评论 0赞 19
一桩弑父案，背后竟有这般阴谋
文/苍兰香墨我抬头看了看天上的太阳。三九已至，却和暖如春，着一层夹袄步出监牢的瞬间，已是汗流浃背。一阵脚步声响...
开封第一讲书人阅读 31,423评论 1赞 260
情欲美人皮
我被黑心中介骗来泰国打工，没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留，地道东北人。一个月前我还...
沈念sama阅读 45,423评论 2赞 352
代替公主和亲
正文我出身青楼，却偏偏与公主长得像，于是被迫代替她去往敌国和亲。传闻我的和亲对象是个残疾皇子，可洞房花烛夜当晚...
茶点故事阅读 42,722评论 2赞 345