TigerSHARC CLU Exploration of XCORRS for TakeHome Quiz 4 BIAWPQHI 13 April start of class - PowerPoint PPT Presentation

1
TigerSHARC CLU: Exploration of XCORRS for Take-Home Quiz 4
BIAWPQHI -- 13 April, start of class

M. Smith,
University of Calgary, Canada
smithmr_at_ucalgary.ca

2
Ideal -- Take Home Quiz

Develop tests for complex correlation
Time and functionality
Evaluate on
C in default and optimized mode (especially optimized)
Your optimized complex assembly code in complex correlation in SISD and SIMD modes
XCORRS in complex correlation in SISD and SIMD modes

3
Reasonable -- Take Home Quiz: Code and report

Develop Functionality and Time tests for real FIR -- based on Lab. 3
Use on optimized C and your SISD and SIMD FIR
Develop Functionality and Time tests for real correlation -- based on Lab. 3 / 4
Use on optimized C and your SISD and SIMD correlation
Work out (theory) the speed changes expected on your SISD and SIMD code if it went to complex. Use as a template for expected changes in optimized C
Develop Functionality and Time tests for complex FIR
Use on optimized C
Develop Functionality and Time tests for complex correlation
Use on optimized C and your SISD and SIMD XCORRS only
Report on whether changes in C code speed work the way you expect
Use these figures to scale for FIR and correlation to complex data
Report on relative speeds
C in default and optimized mode (especially optimized)
Your optimized complex assembly code in complex correlation in SISD and SIMD modes
XCORRS in complex correlation in SISD and SIMD modes

4
Mark assignment

My tests and C are available on the web
If you use my tests, then you must say so, and 10% of marks are deducted
If you use my C code, then you must say so, and 10% of marks are deducted
If you use my C code and my tests, then you must say so, and 20% of marks are deducted

5
Speed comparison Part 1

Real FIR
float / int values[ ], params[ ]
Loop
sum = sum + values * params
2 memory fetches
1 add and 1 mult per loop cycle
done in ½ cycle in theory
Time = N / 2 + overhead
Determine overhead by measuring with and without the loop-sum

Complex FIR
CMPX float / int values[ ], params[ ]
Loop: many common factors with FFT. Hint for final?
sum = sum + values * params
Real sum += v.re * p.re - v.im * p.im
Imag sum += v.re * p.im + v.im * p.re
8 memory fetches
3 add / sub and 4 mult per loop
Time = ??? + overhead
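
The two inner loops compared on this slide can be sketched in portable C++ as follows. This is a minimal sketch for counting operations, not the lab code: the names realFir, complexFir and Cmplx are hypothetical, and plain arrays are used instead of the dm / pm memory placement on the TigerSHARC.

```cpp
#include <cstddef>

// Hypothetical complex type, for illustration only.
struct Cmplx { float re; float im; };

// Real FIR: each tap fetches one value and one param, then does 1 multiply and 1 add.
float realFir(const float *values, const float *params, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        sum += values[i] * params[i];
    return sum;
}

// Complex FIR: each tap needs 4 multiplies. Written this way, the source
// text names v.re, v.im, p.re and p.im twice each (8 operand reads per tap).
Cmplx complexFir(const Cmplx *v, const Cmplx *p, std::size_t n) {
    Cmplx sum = {0.0f, 0.0f};
    for (std::size_t i = 0; i < n; ++i) {
        sum.re += v[i].re * p[i].re - v[i].im * p[i].im;
        sum.im += v[i].re * p[i].im + v[i].im * p[i].re;
    }
    return sum;
}
```

Storing the real and imaginary parts next to each other, as Cmplx does here, is only one possible layout. Whether it reduces the number of fetches depends on the processor's memory width.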

6
Speed comparison Part 2

Speed in theory without doing anything special
Any special way to store complex values to speed up memory access?
Do we need to do 8 memory fetches?
On the Blackfin?
In the TigerSHARC?
Expected optimal speed?
Time = ??? + overhead

Complex FIR
CMPX float / int values[ ], params[ ]
Loop: many common factors with FFT. Hint for final?
sum = sum + values * params
Real sum += v.re * p.re - v.im * p.im
Imag sum += v.re * p.im + v.im * p.re
8 memory fetches
3 add / sub and 4 mult per loop
Time = ??? + overhead

7
Speed comparison Part 3?

Do these speed calculations scale the same way for complex correlation as for complex FIR?
Do a theory calculation and then compare the result for debug and optimized C code to validate
Within 25% of predicted changes is probably more than reasonable for a back-of-envelope calculation
Use scaling factor on your real FIR and correlation functions
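
One way to make the back-of-envelope comparison concrete is a small helper that scales a measured real-data time by a predicted factor and then checks a measured complex-data time against the prediction. This is a sketch only: both function names are hypothetical, and it assumes the overhead measured for the real version is the same for the complex version.

```cpp
#include <cmath>

// Hypothetical helper: true when `measured` lies within `tolerance`
// (a fraction, 0.25 by default) of `predicted`.
bool withinTolerance(double predicted, double measured, double tolerance = 0.25) {
    return std::fabs(measured - predicted) <= tolerance * std::fabs(predicted);
}

// Hypothetical helper: predict the complex-data cycle count from a real-data
// measurement. The overhead is removed before scaling and added back after
// (assumption: the overhead does not change when going to complex data).
double predictComplexCycles(double realCycles, double overheadCycles, double scale) {
    return (realCycles - overheadCycles) * scale + overheadCycles;
}
```

For example, with a made-up real measurement of 1100 cycles, 100 cycles of overhead and a scale factor of 4, the prediction is 4100 cycles, so a complex measurement of 4500 cycles would count as agreeing and 5200 cycles would not.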

8
Tests for following functions needed. When to convert from float to int?

void ConvertReal2Complex(float *, CMPX32 *, int size)
Make Complex = Real + j0
bool ConvertC32_2_C8(CMPX32 *, CMPX8 *, int size)
Take bottom 8 bits of complex 32
Return false if overflows
Complex 8 is packed, 2 complex values into 32 bits --- int format
bool ConvertC32_2_C1(CMPX32 *, CMPX1 *, int size)
Take bottom 1 bit of complex 32
Return false if overflows, or if not in +/-1 +/-j1 format
Complex 1 is packed, 16 complex values into 32 bits --- int format
void ConvertC8_2_C32(CMPX8 *, CMPX32 *, int size) needed? YES
void ConvertC1_2_C32(CMPX1 *, CMPX32 *, int size) needed?
9
Tests for following functions needed

float RealFIR(float *vals, float *params, int size, bool overhead)
CMPLX ComplexFIR(CMPLX *vals, CMPLX *params, int size, bool overhead)
vals in dm and params in pm
void RealCorrs(float *vals, int size1, float *params, int size2, float *result, int *size3, bool overhead)
void ComplexCorrs(CMPLX *vals, int size1, CMPLX *params, int size2, CMPLX *result, int *size3, bool overhead)
void XCORRS(CMPLX *vals, int size1, CMPLX *params, int size2, CMPLX *result, int *size3, bool overhead, int version)
version: 0 = works, 1 = SISD, 2 = SIMD

10
Some hints

void XCORRS(CMPLX *vals, int size1, CMPLX *params, int size2, CMPLX *result, int *size3, bool overhead, int version)
    bool ConvertC32_2_C8(CMPX32 *, dm CMPX8 *, int size1)
    bool ConvertC32_2_C1(CMPX32 *, pm CMPX1 *, int size2)
    size3 = size1 - size2
    for result 1 to size3
        result = 0
    if (!overhead) XCORRS(dm CMPX8 *, pm CMPX1 *, dm? result, size1, size2, size3, whichversion)

11
Some Hints

void ComplexCorrs(CMPLX *vals, int size1, CMPLX *params, int size2, CMPLX *result, int *size3, bool overhead)
    if (overhead) return
    size3 = size1 - size2
    for loop to size3
        result[loop] = ComplexFIR(vals, CMPLX params, int size, bool overhead)
        vals++
    end loop

12
Some decisions

Complex 32 -- first decision
Store real in dm space and imaginary in pm space?
Complex8 in dm space, Complex1 in pm space
Doing everything with static pm variables
Using dm variables on stack, in an attempt to avoid running out of memory
Try with satellite data of size 2048 and PRN data of size 1024, but suspect there may not be enough room when doing this with Complex 32, so may have to test on smaller sizes for comparison
I ended up generating the same data as for the xcorrs( ) shown last Friday, size 48 = 16 * 3. Decided that if I could handle that (3 times round the xcorrs loop) then that was a fair enough test

13
Some Tests developed 1
TEST(ConvertReal2CMPLX32, D_TEST) {
    TEST_LEVEL(1);
    #define TEST_SIZE 8
    float values[TEST_SIZE] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0};
    float zeros[TEST_SIZE] = {0, 0, 0, 0, 0, 0, 0, 0};
    ConvertReal2Complex(values, C32Real, C32Imag, TEST_SIZE);
    ARRAYS_EQUAL(values, C32Real, TEST_SIZE);
    ARRAYS_EQUAL(zeros, C32Imag, TEST_SIZE);
}
14
Test for packed data C8 format
#define TEST_SIZE 8
pm float imag1[TEST_SIZE] = {0x04, 0x14, -0x8, -0x18, 0x24, 0x34, 0x44, 0x54};
float real1[TEST_SIZE] = {0x08, 0x18, -1, -2, 0x28, 0x38, 0x48, 0x58};

TEST(ConvertToCMPLX8, D_TEST) {
    TEST_LEVEL(1);
    #define TEST_SIZE 8
    unsigned int result[4] = {0x14180408, 0xE8FEF8FF, 0x34382428, 0x54584448};
    CHECK(!ConvertC32_2_C8(real1, imag1, DATAC8, 1));
    CHECK(ConvertC32_2_C8(real1, imag1, DATAC8, TEST_SIZE));
    ARRAYS_EQUAL(DATAC8, result, TEST_SIZE / 2);
}
15
Test for packed data C1 format
#define LONGER_SIZE 32
pm float imag2[LONGER_SIZE] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ..};
float real2[LONGER_SIZE] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ..};
pm float imag4[LONGER_SIZE];
float real4[LONGER_SIZE];

TEST(ConvertCMPLX1, D_TEST) {
    TEST_LEVEL(1);
    unsigned int result1[2] = {0x00000000, 0x00000000};
    unsigned int result2[2] = {0xFFFFFFFF, 0xFFFFFFFF};
    CHECK(!ConvertC32_2_C1(real1, imag1, PRNC1, 1));
    CHECK(!ConvertC32_2_C1(real1, imag1, PRNC1, TEST_SIZE));
    CHECK(!ConvertC32_2_C1(real2, imag2, PRNC1, 1));
    CHECK(ConvertC32_2_C1(real2, imag2, PRNC1, LONGER_SIZE));
    ARRAYS_EQUAL(PRNC1, result1, LONGER_SIZE / 16);
    for (int i = 0; i < LONGER_SIZE; i++) {
        real4[i] = -1 * real2[i];
        imag4[i] = -1 * imag2[i];
    }
    CHECK(ConvertC32_2_C1(real4, imag4, PRNC1, LONGER_SIZE));
    ARRAYS_EQUAL(PRNC1, result2, LONGER_SIZE / 16);
}
16
RealFIR
#define TEST_SIZE 8
pm float params[TEST_SIZE] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0};

TEST(RealFIR, D_TEST) {
    TEST_LEVEL(1);
    float impulse[TEST_SIZE];
    float results[TEST_SIZE];
    for (int i = 0; i < TEST_SIZE; i++) {
        for (int j = 0; j < TEST_SIZE; j++)    // Set to zero
            impulse[j] = 0;
        impulse[i] = 1;
        results[i] = RealFIR(impulse, params, TEST_SIZE, false);
    }
    ARRAYS_EQUAL(results, params, TEST_SIZE);
}
17
Complex FIR tests (3 of them). To see if I got both Real and Imag correct
pm float resultsI[TEST_SIZE];

TEST(ComplexFIR, D_TEST) {
    TEST_LEVEL(1);
    float impulse[TEST_SIZE];
    float resultsR[TEST_SIZE];
    float zeros[TEST_SIZE] = {0, 0, 0, 0, 0, 0, 0, 0};
    for (int i = 0; i < TEST_SIZE; i++) {
        for (int j = 0; j < TEST_SIZE; j++)    // Set to zero
            impulse[j] = 0;
        impulse[i] = 1;
        for (int j = 0; j < TEST_SIZE; j++) {
            C32Real[j] = impulse[j];
            C32Imag[j] = 0;
            C32Real1[j] = params[j];
            C32Imag1[j] = 0;
        }
        ComplexFIR(C32Real, C32Imag, C32Real1, C32Imag1,
                   &resultsR[i], &resultsI[i], TEST_SIZE, false);
    }
    ARRAYS_EQUAL(resultsR, params, TEST_SIZE);
    ARRAYS_EQUAL(resultsI, zeros, TEST_SIZE);
}
18
Real Correlation
pm float PRN32I[TEST_SIZE] = {1, -1, 1, -1, 1, 0, 0, 0};

TEST(RealCorrelation, D_TEST) {
    TEST_LEVEL(1);
    float data[TEST_SIZE * 2] = {0, 0, 0, 0, 1, -1, 1, -1, 1,
                                 0, 0, 0, 0, 0, 0, 0};
    float result[TEST_SIZE];
    int Iresult[TEST_SIZE];
    int size3;
    RealCorrs(data, 2 * TEST_SIZE, PRN32I, TEST_SIZE, result, &size3, false);
    CHECK(size3 == TEST_SIZE);
    for (int j = 0; j < TEST_SIZE; j++)
        Iresult[j] = result[j];
    CHECK(MaximumLocation(Iresult, TEST_SIZE) == 4);
}
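
MaximumLocation( ) is called in these tests but its code is not shown on the slides. A minimal sketch, assuming it returns the zero-based index of the largest element and picks the first one on ties (both assumptions, chosen to be consistent with the value 4 expected in the test above):

```cpp
// Sketch of a possible MaximumLocation(); the slides only show how it is called.
// Returns the zero-based index of the largest value, first occurrence on ties.
// Assumes size >= 1.
int MaximumLocation(const int *values, int size) {
    int best = 0;
    for (int i = 1; i < size; ++i) {
        if (values[i] > values[best]) best = i;
    }
    return best;
}
```

With the real correlation data above, the PRN pattern lines up with the data at lag 4, so the largest correlation sum sits at index 4.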
19
Complex Correlation -- Simple Test
pm float dataI[TEST_SIZE * 2] = {0, 0, 0, 0, 1.0, -1, 1, -1, 1,
                                 0, 0, 0, 0, 0, 0, 0};
pm float resI[TEST_SIZE];

TEST(ComplexCorrelation, D_TEST) {
    TEST_LEVEL(1);
    float dataR[TEST_SIZE * 2] = {0.0, 0, 0, 0, 0, 0, 0, 0,
                                  0, 0, 0, 0, 0, 0, 0, 0};
    float resR[TEST_SIZE];
    int Iresult[TEST_SIZE];
    float parR[TEST_SIZE] = {0, 0, 0, 0, 0, 0, 0, 0};
    int size3;
    ComplexCorrs(dataR, dataI, TEST_SIZE * 2,
                 parR, PRN32I, TEST_SIZE,
                 resR, resI, &size3, false);
    CHECK(size3 == TEST_SIZE);
    for (int j = 0; j < TEST_SIZE; j++)
        Iresult[j] = abs(resR[j]);
    CHECK(MaximumLocation(Iresult, TEST_SIZE) == 4);
}
20
Complex Correlation related to results from last
lecture
for (int i = 0; i < 96; i += 3) {
    satXCORRSR[i] = -1;  satXCORRSR[i + 1] = 1;  satXCORRSR[i + 2] = 1;
    satXCORRSI[i] = 0;   satXCORRSI[i + 1] = 0;  satXCORRSI[i + 2] = 0;
}
for (int i = 0; i < 48; i += 3) {
    prnXCORRSR[i] = -1;  prnXCORRSR[i + 1] = 1;  prnXCORRSR[i + 2] = 1;
    prnXCORRSI[i] = -1;  prnXCORRSI[i + 1] = 1;  prnXCORRSI[i + 2] = 1;
}
ComplexCorrs(satXCORRSR, satXCORRSI, 96,
             prnXCORRSR, prnXCORRSI, 48,
             resXCORRSR, resXCORRSI, &size3, false);
CHECK(size3 == 48);
for (int j = 0; j < 48; j++)
    Iresult[j] = abs(resXCORRSR[j]);
for (int j = 1; j < 45; j += 3) {
    CHECK(resXCORRSR[j - 1] == 48);
    CHECK(resXCORRSR[j] == -16);
    CHECK(resXCORRSR[j + 1] == -16);
    CHECK(MaximumLocation(Iresult + j, 48 - j) == 2);
}
21
Complex Correlation ASM related to results from
last lecture
for (int i = 0; i < 96; i += 3) {
    satXCORRSR[i] = -1;  satXCORRSR[i + 1] = 1;  satXCORRSR[i + 2] = 1;
    satXCORRSI[i] = 0;   satXCORRSI[i + 1] = 0;  satXCORRSI[i + 2] = 0;
}
for (int i = 0; i < 48; i += 3) {
    prnXCORRSR[i] = -1;  prnXCORRSR[i + 1] = 1;  prnXCORRSR[i + 2] = 1;
    prnXCORRSI[i] = -1;  prnXCORRSI[i + 1] = 1;  prnXCORRSI[i + 2] = 1;
}
ComplexCorrsASM(satXCORRSR, satXCORRSI, 96,
                prnXCORRSR, prnXCORRSI, 48,
                resXCORRSR, resXCORRSI, &size3, false);
CHECK(size3 == 48);
for (int j = 0; j < 48; j++)
    Iresult[j] = abs(resXCORRSR[j]);
for (int j = 1; j < 45; j += 3) {
    CHECK(resXCORRSR[j - 1] == 48);
    CHECK(resXCORRSR[j] == -16);
    CHECK(resXCORRSR[j + 1] == -16);
    CHECK(MaximumLocation(Iresult + j, 48 - j) == 2);
}
22
bool ConvertC32_2_C8(float *inR, pm float *inI, unsigned int *C8, int size) {
    float *holdR = inR;
    pm float *holdI = inI;
    for (int i = 0; i < size; i++) {
        if ((*inR > 127) || (*inR < -128)) return false;
        if ((*inI > 127) || (*inI < -128)) return false;
        inR++;
        inI++;
    }
    // Not going to bother with things that don't fit
    if (size & 1) return false;
    inR = holdR;
    inI = holdI;
    for (int half = 0; half < size; half += 2) {
        unsigned int first  = ((int) *inR++) & 0xFF;
        unsigned int second = ((int) *inI++) & 0xFF;
        unsigned int third  = ((int) *inR++) & 0xFF;
        unsigned int fourth = ((int) *inI++) & 0xFF;
        *C8++ = ((((((fourth << 8) | third) << 8) | second) << 8) | first);
    }
    return true;
}
23
C8 -> C32 and C16 -> C32
float UINT8ToFloat(unsigned int value) {
    if (value & 0x80) {
        value = value | 0xFFFFFF00;    // sign-extend a negative 8-bit value
        return ((int) value);
    }
    else return value;
}

void ConvertC8_2_C32(unsigned int *C8, float *inR, pm float *inI, int size) {
    for (int i = 0; i < size; i += 2) {
        unsigned int value = *C8++;
        *inR++ = UINT8ToFloat(value & 0xFF);
        value >>= 8;
        *inI++ = UINT8ToFloat(value & 0xFF);
        value >>= 8;
        *inR++ = UINT8ToFloat(value & 0xFF);
        value >>= 8;
        *inI++ = UINT8ToFloat(value & 0xFF);
    }
}
24
FIR filters
float RealFIR(float *values, pm float *params, int size, bool overhead) {
    if (overhead) return 0.0;
    float sum = 0;
    for (int i = 0; i < size; i++)
        sum += *values++ * *params++;
    return sum;
}

pm float sumI = 0;

void ComplexFIR(float *valR, pm float *valI, float *parR, pm float *parI,
                float *resultR, pm float *resultI, int size, bool overhead) {
    if (overhead) {
        *resultR = *resultI = 0;
        return;
    }
    float sumR = 0;
    sumI = 0;    // Was a static variable
    for (int i = 0; i < size; i++) {
        sumR += *valR * *parR - *valI * *parI;
        sumI += *valR * *parI + *valI * *parR;
        valR++;  valI++;  parR++;  parI++;
    }
    *resultR = sumR;
    *resultI = sumI;
    return;
}
25
Correlation
void RealCorrs(float *vals, int size1, pm float *params, int size2,
               float *result, int *size3, bool overhead) {
    if (overhead) return;
    *size3 = size1 - size2;
    for (int j = 0; j < *size3; j++)    // one result per lag; the slide looped to size2, which is the same when size1 == 2 * size2
        *result++ = RealFIR(vals++, params, size2, overhead);
}

void ComplexCorrs(float *valR, pm float *valI, int size1,
                  float *parR, pm float *parI, int size2,
                  float *resR, pm float *resI, int *size3, bool overhead) {
    if (overhead) return;
    *size3 = size1 - size2;
    for (int j = 0; j < *size3; j++)    // one result per lag; the slide looped to size2
        ComplexFIR(valR++, valI++, parR, parI, &resR[j], &resI[j], size2, false);
}
26
Correlation XCORRS
extern "C" void xcorrsfunc(unsigned int *C8, pm unsigned int *C1, unsigned int *C16, int size);

void ComplexXCORRS(float *valR, pm float *valI, int size1,
                   float *parR, pm float *parI, int size2,
                   float *resR, pm float *resI, int *size3, bool overhead) {
    ConvertC32_2_C8(valR, valI, DATAC8, size1);
    *PRNC1 = 0x0;    // Need to shift the PRN to location C15
    ConvertC32_2_C1(parR, parI, PRNC1 + 1, size2);
    *size3 = size1 - size2;
    if (!overhead) xcorrsfunc(DATAC8, PRNC1, RESULTC16, *size3);
    ConvertC16_2_C32(RESULTC16, resR, resI, *size3);
}
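
The slides call ConvertC16_2_C32( ) without showing it. The following is a sketch only, under a loudly stated assumption about the layout: each 32-bit result word is taken to hold one complex value, with the 16-bit two's-complement real part in the low half and the imaginary part in the high half. The slides do not confirm which half is which, and the function name unpackC16 is hypothetical. Plain arrays are used instead of dm / pm arrays.

```cpp
#include <cstdint>

// Sign-extend the low 16 bits of `bits` without relying on
// implementation-defined narrowing conversions.
static int signExtend16(std::uint32_t bits) {
    int v = static_cast<int>(bits & 0xFFFFu);
    return (v & 0x8000) ? v - 0x10000 : v;
}

// Sketch: unpack `size` complex 16-bit results into separate float arrays.
// Assumed layout (not confirmed by the slides): real part in bits 15..0,
// imaginary part in bits 31..16 of each 32-bit word.
void unpackC16(const std::uint32_t *in, float *outR, float *outI, int size) {
    for (int i = 0; i < size; ++i) {
        outR[i] = static_cast<float>(signExtend16(in[i]));
        outI[i] = static_cast<float>(signExtend16(in[i] >> 16));
    }
}
```

If the hardware places the real part in the high half instead, only the two assignments inside the loop need to be swapped.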
27
XCORRS same code as before, except need to transfer results out
// Shift out the values in TR registers into results
xR3:0 = TR3:0;;
Q[J6 += 4] = xR3:0;;
xR3:0 = TR7:4;;
Q[J6 += 4] = xR3:0;;
xR3:0 = TR11:8;;
Q[J6 += 4] = xR3:0;;
xR3:0 = TR15:12;;
Q[J6 += 4] = xR3:0;;
IF NLC0E, JUMP OUTERLOOP;;
28
Need to get inpars and go round more than 16
times
J0 = zeros;;                // Clear the THR registers the hard way
R3:0 = Q[J0 += 4];;
THR3:0 = R3:0;;
R7:4 = R3:0;;
// K0 = prn;;
J2 = J4;;                   // satellite_data
LC0 = 3;;
OUTERLOOP:
K0 = J5;;
J2 = J4;;
J4 = J4 + 8;;               // Increment by 8 and not 16
REST OF CODE UNCHANGED
// Load THR with PRN code
R1:0 = L[K0 += 2];;
THR1:0 = R1:0;;
R1:0 = L[K0 += 2];;
THR3:2 = R1:0;;
29
Test results
