CMPT 365 Project Spring 2020

Objective: Apply course material to a more in-depth project. The project can be done in teams of 2 (but can also be individual). Each team gets a single score, and both members of the team receive identical marks.

For this project, the programming language and platform are open; the objective is to learn about the problem, not to make a lovely GUI (at least, not necessarily; do so if it aids you). It’s also open-ended in that there are many approaches to the problem that stem from the discussion below: e.g., the specification discusses video wipes, and we don’t look at video dissolves or other types of gradual transitions.

1. Project

1.1. STI by Copying Pixels

We are interested in finding and characterizing video transitions, specifically cuts and wipes. One approach to this problem is to construct a “spatio-temporal” image (STI) [STIbb]: an image which contains video content for each frame along the ordinate axis, versus time along the abscissa. A simple version of such a construction consists of copying a column (or row), or a weighted average of a few, directly into the STI.

E.g., suppose that we have a video containing a wipe, as in Fig. 1 below:

Figure 1: Frames from a video that includes a horizontal wipe.

Now suppose we copy the center column from each frame into a new STI image: the STI has the same number of rows r as the original video frame, but a number of columns equal to the number of frames f in the video. The result is shown below in Fig. 2(a). If, instead, we copy over the center row from each frame, turned sideways, then the STI has a number of rows equal to c, the number of columns in the original video, and a number of columns f equal to the number of frames. Such an STI made up of rows is shown in Fig. 2(b).

For example, this video happens to be 120 × 160 pixels per frame, and the clip has 100 frames. So the STI made from the center columns is of size 120 × 100. The STI made from the center rows is 160 × 100 pixels.
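
The copying construction can be sketched in a few lines. This is only an illustration under stated assumptions: frames are represented as nested Python lists (a real program would read them with a video library), the function names `column_sti` and `row_sti` are hypothetical, and the test clip is a synthetic horizontal wipe with the same dimensions as the example above.

```python
def column_sti(frames):
    """Build an STI by copying the center column of each frame.

    frames: list of f frames, each a list of r rows, each row a list of c pixels.
    Returns an r x f image: column t of the STI is the center column of frame t.
    """
    rows = len(frames[0])
    center = len(frames[0][0]) // 2
    return [[frame[y][center] for frame in frames] for y in range(rows)]


def row_sti(frames):
    """Build an STI by copying the center row of each frame, turned sideways.

    Returns a c x f image: column t of the STI is the center row of frame t.
    """
    center = len(frames[0]) // 2
    cols = len(frames[0][0])
    return [[frame[center][x] for frame in frames] for x in range(cols)]


# Synthetic clip (an assumption for illustration): 100 frames of 120 x 160
# pixels in which a new shot (value 1) wipes in from the left over the old
# shot (value 0).
frames = [
    [[1 if x < t * 160 // 100 else 0 for x in range(160)] for _ in range(120)]
    for t in range(100)
]

col_image = column_sti(frames)
row_image = row_sti(frames)
print(len(col_image), len(col_image[0]))  # 120 100
print(len(row_image), len(row_image[0]))  # 160 100
```

In this synthetic clip every row of `col_image` switches from 0 to 1 at the same frame, while in `row_image` the switch happens later for larger x. The first of these is the column STI (120 × 100) and the second is the row STI (160 × 100).
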
Clearly, for the STI made from columns, a transition is evident at the time when the video wipe reaches the center column of the frame. Since the wipe is upright, the edge is also upright. For the STI made from rows, the wipe shows up as a diagonal edge along the time direction.

In general, one can make a taxonomy of such pairs of STIs based on whether the wipe is a vertical wipe, a horizontal wipe, an iris opening, etc. As well, one can use a diagonal across the frame, rather than a column or row, to make a third kind of STI.

Figure 2: (a): STI consisting of center column from each frame; (b): And from center row.

1.2. STI by Histogram Differences

One problem with the method outlined above is how to find that nice edge our eye sees so clearly in Fig. 2. For one thing, it’s often the case that the edge is not so visible, especially in noisy videos such as from broadcast TV. Also, if we rely on the pixels themselves, small movements can muddy the STI. For example, Fig. 3(b) shows an abrupt cut all right, and then the flash of a flash-camera, but the diagonal edge indicating a wipe that follows is hard to see.

Figure 3: (a): Frame during a wipe (broadcast video); (b): STI as in Fig. 2(b).

One approach that is taken to characterize images is to use a histogram of colour, rather than just the raw pixel data. Histograms are fairly insensitive to troublesome problems such as movement and occlusion (i.e., losing sight of an object over time).

It turns out that we also know that if we replace the colour, RGB, by the chromaticity (see Chapter 4)

$\{r, g\} = \{R, G\} / (R + G + B)$

then the image is much more characteristic of the surfaces being imaged, rather than of the light illuminating those surfaces. So, instead of colour, {R, G, B}, let’s use chromaticity, {r, g}. [But watch out for black pixels, i.e., {R, G, B} = {0, 0, 0}, for which the denominator is zero.]

The nice thing about chromaticity is that it’s 2D. So our histogram is a 2D array, with r along one axis and g along another.
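
As a minimal sketch of the chromaticity conversion, the hypothetical helper below maps one pixel to {r, g}. Returning (0.0, 0.0) for black pixels is an assumption of this sketch; the handout only warns that black needs special care, and skipping such pixels altogether is an equally reasonable choice.

```python
def chromaticity(pixel):
    """Map an (R, G, B) pixel to chromaticity (r, g) = (R, G) / (R + G + B).

    Black pixels have R + G + B = 0, so the ratio is undefined; returning
    (0.0, 0.0) for them is an assumption of this sketch, not part of the spec.
    """
    red, green, blue = pixel
    total = red + green + blue
    if total == 0:
        return (0.0, 0.0)
    return (red / total, green / total)


print(chromaticity((255, 0, 0)))    # (1.0, 0.0)
print(chromaticity((50, 100, 50)))  # (0.25, 0.5)
print(chromaticity((0, 0, 0)))      # (0.0, 0.0)
```
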
The chromaticity is necessarily in the interval [0, 1]. But how many bins along each axis should we use? Applying (a cheap version of) a rule of thumb called Sturges’s Rule, the number of bins is N = 1 + log2(n), where n is the size of the data. A rough idea is to use n = number of rows, so e.g. for frames of size 120 × 160 we would use N = floor(1 + log2(120)) = 7 bins along each of the r and g directions. That makes our histogram, H, a small, 7 × 7 array of integer values. If we normalize to make the sum of H equal unity, then we have an array of floats.

Now, suppose we wish to compare one column in a frame with the same column in the previous frame. Then we should compare the histogram $H_t$ at time t with the histogram $H_{t-1}$ for the previous frame. Here, we make a 2D histogram $H_t$ just for the particular column we’re working on, and the histogram $H_{t-1}$ is for the same column, but in the previous frame. Let’s get a measure of histogram difference. One measure would be the Euclidean distance between all the entries in the histogram: the square root of the sum of squares of the differences between entries. But a better value turns out to be the so-called “histogram intersection”:

$I = \sum_i \sum_j \min [H_t(i, j), H_{t-1}(i, j)]$

This formula assumes that one has first divided each histogram by its sum, so that each adds up to unity, e.g., $\sum_i \sum_j H_t(i, j) = 1$. How I works is that, for each array entry, you add up the smaller of the values at that array location, over the two histograms being compared.

Since we compared histograms for a single column, and obtained a scalar I, each column in the whole frame can give us its own scalar, at the current time frame. So an STI made out of I values could have a number of rows equal to the number of columns in the video frame, and number of columns equal to the number of frames. E.g., in Fig. 4(a), we compare histograms for columns, so the STI is 160 × 100, for our example video.
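
The per-column comparison described above can be sketched as follows. The function names are hypothetical, pixels are plain (R, G, B) tuples, and black pixels are simply skipped when building the histogram, which is one assumed way of handling them. The two test columns are made-up uniform colours, chosen only to show the two extreme values of I.

```python
import math


def sturges_bins(n):
    """Number of bins from the rule of thumb N = floor(1 + log2(n))."""
    return int(math.floor(1 + math.log2(n)))


def chromaticity_histogram(pixels, bins):
    """Normalized 2D (r, g) histogram of a list of (R, G, B) pixels.

    Black pixels are skipped (an assumption of this sketch); the remaining
    counts are divided by their total so that the histogram sums to 1.
    """
    hist = [[0.0] * bins for _ in range(bins)]
    counted = 0
    for red, green, blue in pixels:
        total = red + green + blue
        if total == 0:
            continue
        # r and g lie in [0, 1]; clamp so a value of exactly 1 lands in the last bin.
        i = min(int(red / total * bins), bins - 1)
        j = min(int(green / total * bins), bins - 1)
        hist[i][j] += 1.0
        counted += 1
    if counted:
        hist = [[value / counted for value in row] for row in hist]
    return hist


def intersection(h_t, h_prev):
    """Histogram intersection: sum over all bins of the smaller of the two entries."""
    return sum(
        min(a, b)
        for row_t, row_prev in zip(h_t, h_prev)
        for a, b in zip(row_t, row_prev)
    )


bins = sturges_bins(120)          # 120 rows per frame, as in the example
reddish = [(200, 30, 20)] * 120   # one made-up column of 120 reddish pixels
greenish = [(20, 200, 30)] * 120  # the same column after a hypothetical cut
h_red = chromaticity_histogram(reddish, bins)
h_green = chromaticity_histogram(greenish, bins)

print(bins)                          # 7
print(intersection(h_red, h_red))    # 1.0
print(intersection(h_red, h_green))  # 0.0
```

Computing `intersection` for every column of every frame against the same column of the previous frame, and storing the scalars column by column, gives the STI of I values for columns.
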
If we instead compare histograms for rows, then our STI is of size 120 × 100, as in Fig. 4(b).

Figure 4: (a): STI made from histogram intersections of each column in each frame with the previous time instant; (b): And from rows. (c): A thresholded version of (a).

How the histogram intersection works is that, if one histogram is similar to the other, then I ≃ 1. But if the histograms are different, then I ≃ 0. So if we are comparing a video frame column to the same column at a previous instant, then we expect to have I about 1, and that’s what we see except at a wipe, where the current column is from one video but the previous-time column is from another video.

The advantage of doing all this is that the output is much cleaner. If we wish to find a diagonal edge, then Fig. 4(a) provides a fairly clean set of zero values down a straight diagonal, in a background of 1s. A “thresholded” version is shown in Fig. 4(c), where we display the boolean value (I > τ), with τ some high enough value (say, 0.7). Clearly, Fig. 4(c) provides a nice feature to search for in a video, if we wish to automatically find wipes.

2. Your job

Implement these ideas and test on videos. If you’re really enthusiastic, try your code on video dissolves, as well. For faster processing, and also to remove some noise, the size of input frames can first be reduced to 32 × 32.

If you’re truly over the top, then you’ll likely be saying to yourself that histogram intersection is not the best one could do, in that to be compared, pixels have to be at roughly the same colour. But red is fairly like orange, so a reddish histogram bin should not give a zero intersection with an orange histogram bin.

Therefore, IBM has developed a system that uses a different definition of histogram-difference. Suppose we reshape each histogram into a long vector (e.g., a 49-element vector). Denote by z the difference between histograms: $z = H_t - H_{t-1}$.
Now the Euclidean distance between histograms is the square root of $z^T z$, where T means transpose.

But instead, let’s interpose a matrix A, defining a new squared distance $D^2$:

$D^2 = z^T A z$

where A encapsulates nearness of colours. A form suggested is

$A_{ij} = 1 - d_{ij} / d_{max}$

with $d_{ij}$ defined as a three-dimensional colour difference and $d_{max} = \max(d_{ij})$. E.g., we could use Euclidean distance between chromaticities {r, g}. Then $d_{max} = \sqrt{2}$.

3. What to submit

1. As a zipfile, submit code including documentation, along with a short (3-4 pages) written report summarizing what you have done for your project: What worked? What didn’t? How would you like to extend your code? [And, what language did you use? Problems with libraries?]

   Not required, but as a different submission component it would be nice to include a video URL as well, if possible.

   Your project will be marked on effort and usefulness of the resulting product.

2. How hard was this project?

   State briefly how you would improve this project.

   Any ideas for a different project at the same difficulty level? Would you enjoy more difficulty? Less?